Read-only database: Normalize or not for best query performance

I have a pandas DataFrame that looks a bit like this:

         id        name       date     col1     col2  total
0 123456748 EXAMPLENAME 2020-09-01 SOMEDATA MOREDATA   5.99
1 123456748 EXAMPLENAME 2020-09-01 SOMEDATA OTHERDATA 25.99

There are 15 columns, the name values are associated with the ID, and the rest is some data relevant for that person. col2 would have about 400 unique values. The database would be about 300,000,000 rows to start with, and then will grow at about 500,000 records per week.

The records in the database will never be updated or deleted; only new ones will be added. The final purpose of the database is to back a web app in which the user selects the ID of a person. The database would retrieve that person's information, and the website would render a graph and a dataframe. The expected traffic of the website is very low, so I was thinking about using SQLite.

Based on that, I have two questions:

  1. Should I use a relational database, like PostgreSQL or SQLite, or should I try MongoDB? I’m interested in how fast the database can select and retrieve the data; I don’t care much about insert time, as inserts won’t happen very often (once per week).
  2. Based on performance to query, in case you select Relational Databases, should I have all data in one table or should I split it (normalize it)? I read that normalizing a database when its purpose is only to query and store the data, could lead to worse performance than having it all in one table. However, I do not know much about databases and would prefer an expert opinion, or resources to learn more about the correct implementation and maintenance.
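
To make the single-table option concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `records`, the index name `idx_records_id`, and the reduced set of columns are placeholders for illustration only; the real table has 15 columns, and nothing here has been measured at 300,000,000 rows.

```python
import sqlite3

# In-memory database for illustration; a real deployment would use a file.
conn = sqlite3.connect(":memory:")

# Hypothetical single-table layout with a subset of the 15 columns.
conn.execute(
    """
    CREATE TABLE records (
        id    INTEGER NOT NULL,
        name  TEXT    NOT NULL,
        date  TEXT    NOT NULL,
        col1  TEXT,
        col2  TEXT,
        total REAL
    )
    """
)
# Non-unique index: one person (id) owns many rows.
conn.execute("CREATE INDEX idx_records_id ON records (id)")

rows = [
    (123456748, "EXAMPLENAME", "2020-09-01", "SOMEDATA", "MOREDATA", 5.99),
    (123456748, "EXAMPLENAME", "2020-09-01", "SOMEDATA", "OTHERDATA", 25.99),
]
conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?, ?, ?)", rows)


def fetch_person(person_id):
    """Return every row for one person, the lookup the web app performs."""
    cur = conn.execute(
        "SELECT date, col1, col2, total FROM records WHERE id = ? ORDER BY date",
        (person_id,),
    )
    return cur.fetchall()


result = fetch_person(123456748)
```

In this layout the lookup by ID is an index search followed by a read of the matching rows, and no join is involved.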

Thanks.

Author: Jose Vega

Multiple intersection tables vs multiple joins

I have a hierarchical relationship between my tables, with the children having foreign keys referring back to their parent ids (assuming id is the primary key for each table):

Department has many Category Groups
Category Group has many Category(-ies)
Category has many Sub-Category(-ies)
Sub-Category has many Attributes.

All of these entities except Attributes are optional selections in my hierarchical, cascading-dropdown UI. If I select nothing, I need to display the Attributes that belong to all Departments; if I select only a Department, I need to display the Attributes that belong to all Category Groups in that Department; and so on down the hierarchy.

Obviously, one option to implement it is to do an inner join between all the tables to get to Attribute. For instance, if nothing is selected it will be:
Department inner join Category Group
inner join Category
inner join Sub-Category
inner join Attribute
to show all the attributes belonging to all departments.
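
A minimal sketch of this join chain, run through Python's sqlite3 module. The table and column names (`department_id`, `category_group_id`, and so on) and the sample rows are invented for illustration, and the optional filters are written as `:param IS NULL OR ...` so that an unselected dropdown simply disables its condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema: each child table has a foreign key to its parent.
    CREATE TABLE department     (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE category_group (id INTEGER PRIMARY KEY, name TEXT,
                                 department_id INTEGER REFERENCES department(id));
    CREATE TABLE category       (id INTEGER PRIMARY KEY, name TEXT,
                                 category_group_id INTEGER REFERENCES category_group(id));
    CREATE TABLE sub_category   (id INTEGER PRIMARY KEY, name TEXT,
                                 category_id INTEGER REFERENCES category(id));
    CREATE TABLE attribute      (id INTEGER PRIMARY KEY, name TEXT,
                                 sub_category_id INTEGER REFERENCES sub_category(id));

    INSERT INTO department     VALUES (1, 'D1'), (2, 'D2');
    INSERT INTO category_group VALUES (1, 'G1', 1), (2, 'G2', 2);
    INSERT INTO category       VALUES (1, 'C1', 1), (2, 'C2', 2);
    INSERT INTO sub_category   VALUES (1, 'S1', 1), (2, 'S2', 2);
    INSERT INTO attribute      VALUES (1, 'A1', 1), (2, 'A2', 1), (3, 'A3', 2);
    """
)

# A NULL parameter means "nothing selected at this level".
QUERY = """
    SELECT a.name
    FROM department d
    INNER JOIN category_group g ON g.department_id = d.id
    INNER JOIN category c       ON c.category_group_id = g.id
    INNER JOIN sub_category s   ON s.category_id = c.id
    INNER JOIN attribute a      ON a.sub_category_id = s.id
    WHERE (:department_id IS NULL OR d.id = :department_id)
      AND (:category_group_id IS NULL OR g.id = :category_group_id)
    ORDER BY a.id
"""


def attributes(department_id=None, category_group_id=None):
    """List attribute names, narrowed by whichever levels are selected."""
    params = {
        "department_id": department_id,
        "category_group_id": category_group_id,
    }
    return [row[0] for row in conn.execute(QUERY, params)]


all_attrs = attributes()               # nothing selected
d1_attrs = attributes(department_id=1)  # only a Department selected
```

The same pattern extends to Category and Sub-Category filters by adding one more condition per level.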

The other thought in my head is to have intersection/relation mapping table(s) –
DepartmentAttributeRelation which has foreign keys to Department and Attribute,
CategoryGroupAttributeRelation which has foreign keys to CategoryGroup and Attribute and so on.
This will enable direct search to get to the Attributes given any entity.

My question is – Are there any downsides to the second approach above or are there any better approaches to solve this?

Author: linuxNoob

Restaurant reservations – Tables combinations

I have 3 database tables in a reservation system for a restaurant. So far the software has been used only by the restaurant’s staff, but we want to accept reservations online as well. We have some small tables for 2 that can easily be pushed together to make room for bigger parties, and I want to accept a reservation automatically if all of the tables in a combination are available.

tables: holds all tables for each area in the restaurant.

| id | min_capacity | max_capacity | name | area   |
|----|--------------|--------------|------|--------|
| 1  | 2            | 4            | #1   | Inside |
| 2  | 6            | 8            | #2   | Inside |

reservations: holds reservation details

| id | datetime            | name     | status   |
|----|---------------------|----------|----------|
| 1  | 2020-09-01 20:00:00 | John Doe | Upcoming |
| 2  | 2020-09-05 13:00:00 | Jane Doe | Upcoming |

And one pivot table that holds reservation <=> table relation:

| id | table_id | reservation_id |
|----|----------|----------------|
| 1  | 1        | 1              |
| 2  | 2        | 2              |

How can I store different combinations of tables (manually entered) and “attach” reservations to tables/table combinations (so I can check if tables are available for specific time) efficiently?
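
One possible shape for this, sketched with Python's sqlite3 module. Everything beyond the three existing tables is an assumption: the `combinations` and `combination_table` tables, the pivot table name `reservation_table`, the extra sample tables #3 and #4, and the fixed two-hour reservation length used for the overlap check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tables (id INTEGER PRIMARY KEY, min_capacity INTEGER,
                         max_capacity INTEGER, name TEXT, area TEXT);
    CREATE TABLE reservations (id INTEGER PRIMARY KEY, datetime TEXT,
                               name TEXT, status TEXT);
    CREATE TABLE reservation_table (id INTEGER PRIMARY KEY, table_id INTEGER,
                                    reservation_id INTEGER);
    -- Hypothetical additions: manually entered combinations and their members.
    CREATE TABLE combinations (id INTEGER PRIMARY KEY, min_capacity INTEGER,
                               max_capacity INTEGER);
    CREATE TABLE combination_table (combination_id INTEGER, table_id INTEGER,
                                    PRIMARY KEY (combination_id, table_id));

    INSERT INTO tables VALUES (1, 2, 4, '#1', 'Inside'), (2, 6, 8, '#2', 'Inside'),
                              (3, 1, 2, '#3', 'Inside'), (4, 1, 2, '#4', 'Inside');
    INSERT INTO reservations VALUES
        (1, '2020-09-01 20:00:00', 'John Doe', 'Upcoming'),
        (2, '2020-09-05 13:00:00', 'Jane Doe', 'Upcoming');
    INSERT INTO reservation_table VALUES (1, 1, 1), (2, 2, 2);
    -- Combination 1 = tables 3 + 4, combination 2 = tables 1 + 3.
    INSERT INTO combinations VALUES (1, 3, 4), (2, 5, 6);
    INSERT INTO combination_table VALUES (1, 3), (1, 4), (2, 1), (2, 3);
    """
)

# A combination is free when none of its member tables is attached to an
# upcoming reservation that starts within two hours of the requested time.
QUERY = """
    SELECT c.id
    FROM combinations c
    WHERE :party BETWEEN c.min_capacity AND c.max_capacity
      AND NOT EXISTS (
          SELECT 1
          FROM combination_table ct
          JOIN reservation_table rt ON rt.table_id = ct.table_id
          JOIN reservations r       ON r.id = rt.reservation_id
          WHERE ct.combination_id = c.id
            AND r.status = 'Upcoming'
            AND r.datetime > datetime(:start, '-2 hours')
            AND r.datetime < datetime(:start, '+2 hours')
      )
    ORDER BY c.id
"""


def free_combinations(party_size, start):
    """Return ids of combinations that fit the party and are free at `start`."""
    params = {"party": party_size, "start": start}
    return [row[0] for row in conn.execute(QUERY, params)]


party_of_four = free_combinations(4, "2020-09-01 20:30:00")
party_of_six = free_combinations(6, "2020-09-01 20:30:00")
```

Under this layout, a reservation for a combination would be attached by inserting one pivot row per member table, so the existing per-table availability check keeps working unchanged.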

Author: Clarissa

What is the best database design for storing survey form with different types of questions and answer formats and branching is possible?

I would like to store the format of a survey form that can branch into different questions based on the answers given. Questions can be video, audio, or text, and answers can be text, multiple choice, video, audio, geolocation, etc. A user should also be able to fill in the form over multiple sessions, so some state has to be stored. As a result, answers can be missing either because of branching or because the response is incomplete. The database needs to support fast filtering and analysis, and it should be possible to export all the responses to a particular form as a CSV file. What would be the best implementation for this problem?

Author: Shrey Paharia

SQL Server Unique Constraint on two columns with an exception

Hi all and thanks for your advice.

Expense(SupplierID (foreign key), DocumentID (varchar))

I understand how to add a simple unique constraint on two columns. However, if DocumentID = ‘NA’, I would like to ignore the rules of the constraint.

Some suppliers in our system do not provide an invoice ID, for example, so I currently leave the field NULL. I would like to replace those NULLs in DocumentID with ‘NA’ so that my client code does not have to account for NULLs.

I am new to SQL Server, but I could figure out how to do this using a trigger. The reason I’m asking here is to see if there is a better way to respond to this scenario or to avoid it by a different design.
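
One alternative to a trigger is a unique index that covers only the rows where DocumentID is not ‘NA’. SQL Server calls this a filtered index and accepts a `WHERE` clause on `CREATE UNIQUE INDEX`. The sketch below demonstrates the same idea with SQLite's equivalent (a partial index) through Python, purely because it is easy to run; the index name and sample values are made up, and the exact SQL Server behavior should be checked against its documentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Expense (
        SupplierID INTEGER NOT NULL,
        DocumentID TEXT    NOT NULL DEFAULT 'NA'
    );
    -- Uniqueness is enforced only for rows whose DocumentID is not 'NA'.
    CREATE UNIQUE INDEX UX_Expense_Supplier_Document
        ON Expense (SupplierID, DocumentID)
        WHERE DocumentID <> 'NA';
    """
)


def try_insert(supplier_id, document_id):
    """Insert one row; return False if the unique index rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Expense VALUES (?, ?)", (supplier_id, document_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(1, "INV-001"),  # first real document id: accepted
    try_insert(1, "INV-001"),  # same supplier and document id: rejected
    try_insert(1, "NA"),       # 'NA' is outside the index: accepted
    try_insert(1, "NA"),       # repeated 'NA' is also accepted
]
```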

Thanks!

Author: Tom Schreiner

Scaling out MySQL & Redundancy-Speed tradeoff?

I’m building an e-commerce service for a group of sellers. They share a common HQ that manufactures their products.

Tables:

  1. order (id, seller_id, timestamp)
  2. order_products (order_id, product_id, seller_id, timestamp, pincode)
  3. transaction (id, seller_id, timestamp)
  4. transaction_products (transaction_id, product_id, seller_id, timestamp, pincode)
  5. seller (id, pincode, name)
  6. product(id, price)

Specifications:

  1. There are 100 sellers
  2. Each vendor performs 500 transactions per day
  3. Each transaction has 4 products associated with it
  4. Each Vendor places two orders per day to HQ
  5. Each order has 50 products
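
As a rough sizing aid, the specifications above translate into the following daily row counts. This is plain arithmetic on the stated numbers, assuming that "vendor" and "seller" mean the same thing and that every day is a working day.

```python
# Figures taken directly from the specifications.
SELLERS = 100
TRANSACTIONS_PER_SELLER_PER_DAY = 500
PRODUCTS_PER_TRANSACTION = 4
ORDERS_PER_SELLER_PER_DAY = 2
PRODUCTS_PER_ORDER = 50

# New rows per day in each table.
transaction_rows = SELLERS * TRANSACTIONS_PER_SELLER_PER_DAY       # 50,000
transaction_product_rows = transaction_rows * PRODUCTS_PER_TRANSACTION  # 200,000
order_rows = SELLERS * ORDERS_PER_SELLER_PER_DAY                   # 200
order_product_rows = order_rows * PRODUCTS_PER_ORDER               # 10,000

rows_per_day = (
    transaction_rows + transaction_product_rows + order_rows + order_product_rows
)                                                                  # 260,200
rows_per_year = rows_per_day * 365                                 # 94,973,000

print(f"{rows_per_day:,} rows/day, {rows_per_year:,} rows/year")
```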

HQ Requirements:

  1. How many products were sold by which seller in a given month
  2. How many products were sold in a given pincode in a given month
  3. Orders placed by all sellers in a given month

Seller Requirements:

  1. View cost of order placed by him/her (the seller)
  2. View his/her sales of a given month

The product is ready and the application works just fine. But I’m concerned about two things.

  1. Scaling: Being really new to this, I don’t know much about scaling out, sharding, or clustering. How long can I put these off before I have to deal with them?
  2. Redundancy: As you can see in transaction_products & order_products, I’ve repeated columns from transaction & order, respectively. The redundant columns are timestamp, seller_id, and pincode. My idea was to avoid joins, but I’m not sure whether joins would be more expensive than the current redundancy. Can anyone point me in the right direction?
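
To make the trade-off concrete, the sketch below answers the first HQ requirement (products sold per seller in a given month) in two ways: once with a join to the parent table and once from the redundant columns alone. It uses SQLite through Python rather than MySQL so that it is self-contained; the sample rows are invented, and it shows only that the two queries agree, not which one is faster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- "transaction" is quoted because it is an SQL keyword in SQLite.
    CREATE TABLE "transaction" (id INTEGER PRIMARY KEY, seller_id INTEGER,
                                timestamp TEXT);
    CREATE TABLE transaction_products (transaction_id INTEGER, product_id INTEGER,
                                       seller_id INTEGER, timestamp TEXT,
                                       pincode TEXT);

    INSERT INTO "transaction" VALUES (1, 10, '2020-09-01 10:00:00'),
                                     (2, 10, '2020-09-15 11:00:00'),
                                     (3, 20, '2020-09-20 12:00:00'),
                                     (4, 20, '2020-10-02 09:00:00');
    INSERT INTO transaction_products VALUES
        (1, 100, 10, '2020-09-01 10:00:00', '110001'),
        (1, 101, 10, '2020-09-01 10:00:00', '110001'),
        (2, 100, 10, '2020-09-15 11:00:00', '110001'),
        (3, 102, 20, '2020-09-20 12:00:00', '560001'),
        (4, 100, 20, '2020-10-02 09:00:00', '560001');
    """
)

# Normalized style: seller and time come from the parent row via a join.
WITH_JOIN = """
    SELECT t.seller_id, COUNT(*)
    FROM transaction_products tp
    JOIN "transaction" t ON t.id = tp.transaction_id
    WHERE t.timestamp >= :start AND t.timestamp < :end
    GROUP BY t.seller_id
    ORDER BY t.seller_id
"""

# Denormalized style: the redundant columns make the join unnecessary.
WITHOUT_JOIN = """
    SELECT seller_id, COUNT(*)
    FROM transaction_products
    WHERE timestamp >= :start AND timestamp < :end
    GROUP BY seller_id
    ORDER BY seller_id
"""

september = {"start": "2020-09-01 00:00:00", "end": "2020-10-01 00:00:00"}
joined = conn.execute(WITH_JOIN, september).fetchall()
redundant = conn.execute(WITHOUT_JOIN, september).fetchall()
```

Both queries return the same counts as long as the copied columns stay in sync with their parent rows, which is the consistency cost the redundancy introduces.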

Author: Koushik Shom Choudhury

common columns in all tables in mysql

I want to create a table like base_table with the following columns:
id, created_at, created_by.

For all other tables, I want the created_at and created_by columns to be available through inheritance.
I don’t want to create these common columns in all other tables.

Author: zeeshank1