I’m building an e-commerce service for a group of sellers. They share a common HQ that manufactures their products.
- order (id, seller_id, timestamp)
- order_products (order_id, product_id, seller_id, timestamp, pincode)
- transaction (id, seller_id, timestamp)
- transaction_products (transaction_id, product_id, seller_id, timestamp, pincode)
- seller (id, pincode, name)
- product (id, price)
- There are 100 sellers
- Each seller performs 500 transactions per day
- Each transaction has 4 products associated with it
- Each seller places two orders per day with HQ
- Each order has 50 products
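As a rough sketch of what these figures imply for table growth, the arithmetic below multiplies them out. It assumes one row per product per transaction or order (the schema has no quantity column) and a 365-day year; the constant and function names are illustrative only.

```python
# Daily volumes taken from the figures listed above.
SELLERS = 100
TRANSACTIONS_PER_SELLER_PER_DAY = 500
PRODUCTS_PER_TRANSACTION = 4
ORDERS_PER_SELLER_PER_DAY = 2
PRODUCTS_PER_ORDER = 50


def rows_per_day():
    """Return the number of new rows per table per day.

    Assumption: each product in a transaction or order is stored as one row.
    """
    transactions = SELLERS * TRANSACTIONS_PER_SELLER_PER_DAY
    orders = SELLERS * ORDERS_PER_SELLER_PER_DAY
    return {
        "transaction": transactions,
        "transaction_products": transactions * PRODUCTS_PER_TRANSACTION,
        "order": orders,
        "order_products": orders * PRODUCTS_PER_ORDER,
    }


daily = rows_per_day()
# Assumption: sellers trade every day of a 365-day year.
yearly = {table: count * 365 for table, count in daily.items()}

for table in daily:
    print(f"{table}: {daily[table]:,} rows/day, {yearly[table]:,} rows/year")
```

Under those assumptions the largest table, transaction_products, grows by 200,000 rows per day, or 73 million rows per year.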
- How many products each seller sold in a given month
- How many products were sold in a given pincode in a given month
- Orders placed by all sellers in a given month
- Each seller can view the cost of the orders he/she placed
- Each seller can view his/her sales for a given month
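To make the first two reports concrete, here is a minimal sketch against an in-memory SQLite database. It is not the production setup: the sample rows are made up, timestamps are assumed to be stored as ISO-8601 text, each row is assumed to be one product sold, and the helper names `sold_per_seller` and `sold_in_pincode` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same columns as transaction_products in the schema above.
# Assumption: timestamp is ISO-8601 text, so string comparison orders by time.
conn.execute(
    """
    CREATE TABLE transaction_products (
        transaction_id INTEGER,
        product_id INTEGER,
        seller_id INTEGER,
        timestamp TEXT,
        pincode TEXT
    )
    """
)

# Made-up sample data: three January rows and one February row.
rows = [
    (1, 10, 1, "2024-01-05 10:00:00", "700001"),
    (1, 11, 1, "2024-01-05 10:00:00", "700001"),
    (2, 10, 2, "2024-01-09 12:30:00", "700002"),
    (3, 12, 1, "2024-02-01 09:15:00", "700001"),
]
conn.executemany("INSERT INTO transaction_products VALUES (?, ?, ?, ?, ?)", rows)


def sold_per_seller(conn, month_start, next_month_start):
    """Count products sold by each seller in [month_start, next_month_start)."""
    return conn.execute(
        """
        SELECT seller_id, COUNT(*)
        FROM transaction_products
        WHERE timestamp >= ? AND timestamp < ?
        GROUP BY seller_id
        ORDER BY seller_id
        """,
        (month_start, next_month_start),
    ).fetchall()


def sold_in_pincode(conn, pincode, month_start, next_month_start):
    """Count products sold in one pincode in [month_start, next_month_start)."""
    return conn.execute(
        """
        SELECT COUNT(*)
        FROM transaction_products
        WHERE pincode = ? AND timestamp >= ? AND timestamp < ?
        """,
        (pincode, month_start, next_month_start),
    ).fetchone()[0]


january_by_seller = sold_per_seller(conn, "2024-01-01", "2024-02-01")
january_in_700001 = sold_in_pincode(conn, "700001", "2024-01-01", "2024-02-01")
print(january_by_seller)
print(january_in_700001)
```

Filtering on a half-open range (`>= month_start` and `< next_month_start`) rather than on a function of the timestamp keeps the condition usable with an index on timestamp.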
The product is ready and the application works just fine, but I’m concerned about two things.
- Scaling: Being really new to this, I don’t know much about scaling out, sharding, or clustering. How long can I put these off before they become necessary?
- Redundancy: As you can see in transaction_products & order_products, I’ve reused columns from transaction & order, respectively. The redundant columns are seller_id, timestamp, and pincode. My idea was to avoid joins, but I’m not sure whether joins would be more expensive than the current redundancy. Can anyone point me in the correct direction?
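To illustrate the trade-off in the redundancy question, the sketch below runs the pincode report two ways on an in-memory SQLite database: once from the redundant columns in transaction_products, and once by joining back to transaction and seller. The sample data is made up, timestamps are assumed to be ISO-8601 text, and the function names are hypothetical. It only shows that both forms return the same answer; it makes no claim about which is faster on real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# "transaction" is an SQL keyword, so the table name is quoted.
conn.executescript(
    """
    CREATE TABLE seller (id INTEGER PRIMARY KEY, pincode TEXT, name TEXT);
    CREATE TABLE "transaction" (
        id INTEGER PRIMARY KEY, seller_id INTEGER, timestamp TEXT
    );
    CREATE TABLE transaction_products (
        transaction_id INTEGER, product_id INTEGER,
        seller_id INTEGER, timestamp TEXT, pincode TEXT
    );

    INSERT INTO seller VALUES (1, '700001', 'A'), (2, '700002', 'B');
    INSERT INTO "transaction" VALUES
        (1, 1, '2024-01-05 10:00:00'),
        (2, 2, '2024-01-09 12:30:00'),
        (3, 1, '2024-02-01 09:15:00');
    INSERT INTO transaction_products VALUES
        (1, 10, 1, '2024-01-05 10:00:00', '700001'),
        (1, 11, 1, '2024-01-05 10:00:00', '700001'),
        (2, 10, 2, '2024-01-09 12:30:00', '700002'),
        (3, 12, 1, '2024-02-01 09:15:00', '700001');
    """
)


def by_pincode_denormalized(conn, start, end):
    """Use the redundant pincode and timestamp columns; no join needed."""
    return conn.execute(
        """
        SELECT pincode, COUNT(*)
        FROM transaction_products
        WHERE timestamp >= ? AND timestamp < ?
        GROUP BY pincode
        ORDER BY pincode
        """,
        (start, end),
    ).fetchall()


def by_pincode_joined(conn, start, end):
    """Ignore the redundant columns and join back to the parent tables."""
    return conn.execute(
        """
        SELECT s.pincode, COUNT(*)
        FROM transaction_products AS tp
        JOIN "transaction" AS t ON t.id = tp.transaction_id
        JOIN seller AS s ON s.id = t.seller_id
        WHERE t.timestamp >= ? AND t.timestamp < ?
        GROUP BY s.pincode
        ORDER BY s.pincode
        """,
        (start, end),
    ).fetchall()


denormalized = by_pincode_denormalized(conn, "2024-01-01", "2024-02-01")
joined = by_pincode_joined(conn, "2024-01-01", "2024-02-01")
print(denormalized)
print(joined)
```

The two forms agree only while the copied columns stay in sync with their source rows. If a seller’s pincode changes, for example, the denormalized rows keep the old value unless they are updated as well, which may or may not be the intended behaviour for historical sales.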
Author: Koushik Shom Choudhury