## Using a different algorithm depending on the size of the input

I recently finished a course on advanced algorithms, and another on complexity & computability theory, and in the past few days my mind has been somewhat preoccupied by this question.

Why don’t we just use a different algorithm based on the size of the input?

I’m asking this question because I’ve never seen this done in practice or heard of it, and I’m also simply curious about the answer. I also tried looking it up on StackExchange and Google with various queries but couldn’t come up with anything remotely related to my question.

I’ll take the example of sorting algorithms, as they’re quite common and there are so many, with different properties and runtime complexities.

Say I have three algorithms, `SortA`, `SortB` and `SortC`. `SortA` is incredibly efficient on inputs of size <= 100 but becomes very slow on inputs that are any bigger; `SortB` is more efficient on inputs of length > 100 than `SortA` but falls off quickly after a size of 1000. Finally, `SortC` isn’t very fast on inputs of size < 1000, but is faster than `SortA` and `SortB` on very large inputs.

Why shouldn’t/couldn’t I make a function like this (written in pseudo-C#-ish code for simplicity)? Or why isn’t it done in practice?

```
int[] Sort(int[] numbers) {
    if (numbers.Length <= 100) {
        return SortA(numbers);
    } else if (numbers.Length <= 1000) {
        return SortB(numbers);
    } else {
        return SortC(numbers);
    }
}
```
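For what it's worth, standard libraries do something very similar: introsort (used by many C++ `std::sort` implementations) falls back to insertion sort below a small cutoff, and Python's Timsort uses insertion sort for short runs. A minimal Python sketch of the dispatch idea, with a made-up cutoff and `sorted` standing in for the large-input algorithm:

```python
def insertion_sort(numbers):
    """O(n^2) worst case, but very fast on tiny inputs (low constant factors)."""
    result = list(numbers)
    for i in range(1, len(result)):
        key, j = result[i], i - 1
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result

def hybrid_sort(numbers):
    # Cutoff of 100 mirrors the question; real cutoffs are tuned empirically.
    if len(numbers) <= 100:
        return insertion_sort(numbers)
    return sorted(numbers)  # stand-in for the large-input algorithm
```

The branch costs a single length check, which is negligible next to the sort itself.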

I’m assuming some of the potential reasons are that

1. it’s more code to write,
2. more potential bugs since there’s more code,
3. it’s not necessarily easy to find the exact breakpoints at which some algorithm becomes faster than another, or it might take a lot of time to do so (i.e. running performance tests on various input sizes for every algorithm),
4. the breakpoints might only occur on small or medium-sized inputs, meaning the performance gain won't be significant enough to justify the additional implementation work,
5. it just isn’t worth it in general, and is only done in applications where performance is crucial (similar to how some numerical algorithms use a different method depending on the properties of a matrix, like symmetry, tridiagonality, …),
6. input size isn’t the only factor in an algorithm’s performance.
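On point 3: finding the breakpoints is essentially an empirical measurement problem — sweep input sizes and time each candidate. A rough harness for locating a crossover (the algorithms here are just illustrative stand-ins):

```python
import random
import timeit

def selection_sort(numbers):
    """Deliberately simple O(n^2) candidate for the comparison."""
    result = list(numbers)
    for i in range(len(result)):
        j = min(range(i, len(result)), key=result.__getitem__)
        result[i], result[j] = result[j], result[i]
    return result

def best_time(fn, data, repeats=5):
    # Take the minimum of several runs to filter out scheduler noise.
    return min(timeit.repeat(lambda: fn(data), number=1, repeat=repeats))

if __name__ == "__main__":
    for n in (10, 100, 1000):
        data = [random.random() for _ in range(n)]
        print(n, best_time(selection_sort, data), best_time(sorted, data))
```

The crossover you observe is machine- and data-dependent, which is exactly why this tuning work is a real cost of the hybrid approach.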

I’m familiar with Landau/Big O notation, so feel free to use it in your answers.

Go to Source
Author: cliesens

## What is the best way to deploy a short living process when you have no machine?

I’m new to DevOps.

I created a converter and want to deploy it. The converter converts a 3D model from one format to another; you can then visualize the output file on the platform and download it if you want to.

When benchmarking the process, I found that for now it runs for up to 1 minute when the files are really big. I was using Azure but just moved to AWS this week. For now the converter converts one file at a time and uses Blender’s Python library and a C++ library (when trying Docker, I built these inside the container).

I started by creating a Docker container that would read a heavy blob and then output the converted file, but figured out that Docker is not designed to read local files.

I’m searching for the right model to host this. Is Docker a good solution? If not, are there other ways to do this?

Go to Source
Author: tawfikboujeh

## metrics – how to measure software performance [closed]

I have to study software metrics for a competition. I’ve found a lot of material but I’m really confused. Could you suggest which metrics are used to measure performance in software systems, and when you should prefer one over another?

Go to Source
Author: Mark

## Is a transaction time of <10ms for an SQL database viable? If so, under what conditions?

Appreciate this is a rather odd question, so I will try to clarify as much as possible. Please also be assured this is a question purely for my own education, I’m not about to rush off and do crazy things in our software on the back of it.

I have a customer requirement for a transaction time of <10ms on a system that is based around an SQL database – in our specific implementation it is Oracle DB. I’m aware that this is not a useful or meaningful requirement, so with my business hat on I’ll be dealing with that. I fully expect that the requirement will be revised to something more useful and achievable.

However, I am curious on a technical level. Could you squeeze transaction time on an SQL DB down below 10ms? Let’s be generous and say this is pure SQL execution time: no comms, no abstraction layers, etc. Right now, running `select 1 from dual` on one of our systems gives a reported execution time of 10-20ms, and I’d assume that’s about the simplest query possible. What, if anything, might you do to reduce that time (a) within Oracle/SQL or the server environment, or (b) by making a different tech choice? I’d assume a higher clock speed on the CPU might help, but I wouldn’t bet on it.
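As a purely illustrative data point (SQLite, not Oracle, and an in-memory database at that, so no durable I/O): an in-process engine with no network hop commits a trivial transaction in well under 10 ms on ordinary hardware, which suggests the 10-20 ms you see for `select 1 from dual` is dominated by round trips and reporting overhead rather than SQL execution itself.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")  # in-process: no network round trip
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")

start = time.perf_counter()
with conn:  # the context manager wraps the statement in BEGIN ... COMMIT
    conn.execute("INSERT INTO t (v) VALUES (?)", ("hello",))
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"insert + commit: {elapsed_ms:.3f} ms")
row = conn.execute("SELECT v FROM t WHERE id = 1").fetchone()
```

On a durable, client-server database the time usually goes to fsync-on-commit and network latency, not parsing or execution — which is where tuning (or a different tech choice) would have to focus.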

Go to Source
Author: SimonN

## SQL Server Slowest Query is NULL

I am looking at both the SQL Server expensive queries report and the query below, and both show this mysterious `NULL` query as the slowest query on my server.

Is there any way I can find out more about this `NULL` query and why it might be so slow?

Is this some internal query? It doesn’t seem like this should be showing up in the report if so.

This is the query I’m running, which also shows `NULL` as the slowest query on my server:

```sql
select
    r.session_id,
    r.status,
    r.command,
    r.cpu_time,
    r.total_elapsed_time,
    t.text
from
    sys.dm_exec_requests as r
cross apply
    sys.dm_exec_sql_text(r.sql_handle) as t
```

How can I find out what this query is and why it’s so slow?

Go to Source
Author: user1477388

## Load Generators Giving 100% utilization – Storm Runner

We are trying to hit our AUT (Application Under Test) with almost 5,000 users. We have tried twice, but once roughly 2,500-3,200 users are in the application, both of the load generators spike to 100% utilization. We are using StormRunner for performance testing the application. The scripts use a multi-protocol setup: Web HTTP/HTML and Oracle NCA.

The configuration of both load generators: 16-core processor, 32 GB memory. LG1 has a Platinum 8272CL processor; LG2 has an E5-2673 v4.

Please let us know how to resolve this, or whether there is a way to accurately calculate how many load generators we need. We have also done a lot of research ourselves, but our estimates fail astonishingly in the main test.

Go to Source
Author: Aashish Sharma

## Best archtitecture and methods for high performance computing that needs to scale

I have to make a decision regarding the architecture and methods for a rewrite of a proof-of-concept application I wrote 10 years ago in C++.

It’s about high-performance position calculation based on multi-trilateration. Hundreds or thousands of IoT sensors send their JSON-based distance information to a host using MQTT. From there the information needs to be processed.

My goal is to rewrite it so that it is more real-time and scalable, and so the position-solver application can run in the cloud or on-premises, utilizing the CPU as efficiently as possible by using all cores/threads.

If you were starting from scratch, which architecture, language, and methods would you use? E.g.:

- GoLang? C++ with threads? Rust? Python?
- Architecture?
- Docker?
- GPU support?

Some metrics: up to 10,000 sensors sending 200 JSON distance messages per second to the MQTT broker.

(In my proof of concept there were just 20 sensors and 5 messages per second.)

Any recommendation?

It will be an open-source project, by the way.

Best regards,
//E

Go to Source
Author: Ersan

## Foreign keys to primary tables or nested table

In this hypothetical example, should the foreign key constraints for the `ProductId` and `UserId` columns in the `ProductUserCommentAction` table reference the Product/User tables as shown in the first diagram, or is it OK for those columns to reference the `ProductUserComment` table as shown in the second diagram?

I like how it’s set up in the second diagram, as it reduces the spider web in visualizations.

Are there any downsides to this second approach?

(first diagram) versus (second diagram)

Go to Source
Author: TugboatCaptain

## SQL Server 2019 performance worse than 2012… am I missing something?

We have a SQL Server 2012 server which far outperforms a SQL Server 2019 database on (as far as I can see) the same infrastructure. We are hosting both databases on a cloud platform with the same SLAs. Both have 180GB RAM and 16 processors.

However there are a few key differences.

1. The 2012 database server is Enterprise edition, the 2019 is Standard. As far as I know, this shouldn’t make a difference.
2. The 2012 database was restored to the 2019 server and its compatibility level changed to 150 (2019).
3. MAXDOP on the 2012 server was 0; on the 2019 server it is set to 8, as recommended by Microsoft and others.
4. Cost threshold for parallelism = 5 on the 2012 server, 20 on the 2019 server.

Other database settings were not changed, so the following settings are default on 2019, I believe:

- Legacy Cardinality Estimation = OFF
- Parameter Sniffing = ON
- Query Optimiser Fixes = OFF

Mainly, the queries we run are large, complex multi-join queries performing updates and inserts, with occasional small selects from users. We load large files into the database and then process the data in large queries, usually one at a time. In between these large “loads” we have users doing selects on other database tables not being loaded/processed, in preparation for future load/process steps. Generally we are seeing 30%-50% performance reductions in processing. I figured this was because of the MAXDOP setting, but altering it to 0 made no difference over a series of runs.

Our major symptom is that we get lock timeouts when we try to connect to the 2019 server while it is busy processing, whereas the 2012 server still services connections, just very slowly. I was thinking of setting the connection timeout on the server to a high value, but I suspect we still won’t get responses from the server. It’s like it blocks all new connections if it’s even slightly busy.

Are there other things I should try? Are those database settings worth messing around with?

I could dive in further and start looking at DMVs, however this seems to be close to a “like for like” environment upgrade with considerable drops in performance. Just checking there isn’t something else I should check before doing a bigger investigation.

Go to Source
Author: blobbles