Issues parsing tabular data with awk

I am reposting this because my earlier question was closed and people wanted a little more information. Here is an example of what the output looks like, just typical tab-separated .txt data:

asdfsdf sdfsadf sdfsdf  92  83
sdfsdf  ewrwef  dsruh   32  42
sjgho   uhiu    uhgkuh  91  21

In the above, I want to remove every row whose value after the third tab (column 4) is below 80 or whose value after the fourth tab (column 5) is below 70. In other words, viewed in Excel, the 4th and 5th columns must be above 80 and 70 respectively. In this case, only the first row should remain.

(old question)

I am trying to parse a tabular text file generated by Blastp using awk. Previously I have used this somewhat ugly code, because it worked, to go through to the right columns and cull out values below what I wanted.

#!/bin/bash 
#$ -cwd
#$ -pe mpi 16

awk '$4 > 80.0' blastoutput.txt > StepOne.txt
awk '$5 > 70.0' StepOne.txt > Culled.txt

When I use it on a new blast result, however, the file size stays at about 300k kb, with only a slight decrease after step one and none after step two. My best guess is that awk is treating the whole blast output file as a single line and therefore not removing more. I thought it might be related to Unix/Windows line endings not being recognized, as I saw in other answers, but I haven’t changed the way I generate the blast results and it worked before, so I don’t know why the format of the tabular results would suddenly change.

I’ve also tried using some parsing options I saw in other answers like the following:

perl -lane 'print $_ if ($F[4] >80.0)' blastp_output_8_26.txt > StepOne.txt

but the results seem to be the same.

Does anyone know what I could do to the blastp output file to make it work with my code? I am convinced something is amiss there, but all my attempts to fix it so far have been for naught.
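A minimal sketch of one thing to try, assuming the columns are tab-separated and the new file picked up Windows-style line endings (both are guesses, not confirmed by the question). It strips a trailing carriage return from each line and applies both thresholds in a single awk pass. The sample rows and the temporary file are made up for illustration.

```shell
#!/bin/bash
# Hypothetical sample data: three tab-separated rows written with CRLF line endings
tmpdir=$(mktemp -d)
printf 'asdfsdf\tsdfsadf\tsdfsdf\t92\t83\r\nsdfsdf\tewrwef\tdsruh\t32\t42\r\nsjgho\tuhiu\tuhgkuh\t91\t21\r\n' > "$tmpdir/blastoutput.txt"

# Remove a trailing carriage return (awk re-splits the fields afterwards),
# then print only rows with column 4 above 80 and column 5 above 70
culled=$(awk -F'\t' '{ sub(/\r$/, "") } $4 > 80.0 && $5 > 70.0' "$tmpdir/blastoutput.txt")
printf '%s\n' "$culled"

rm -rf "$tmpdir"
```

Running file blastoutput.txt first usually reports "with CRLF line terminators" when carriage returns are present, which is a quick way to check the assumption before changing anything.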

Thanks.

Author: Asclepius123

Where are default aliases defined

I’ve got a fresh install of CentOS 8 (minimal ISO). I notice that, despite none being listed in either .bashrc or .bash_profile, a bunch of aliases are defined by default in bash. For example,

alias cp='cp -i'
alias egrep='egrep --color=auto'
...

Many of these aliases I’d like to keep. However, where can I find/edit the sources of those definitions?
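A rough sketch of how the definitions could be tracked down, assuming they come from one of the usual system-wide startup files. The helper name find_alias_defs is made up for illustration, and which of the listed files exist depends on the distribution.

```shell
#!/bin/bash
# find_alias_defs is a hypothetical helper: print file:line:text for every
# alias definition found in the given files or directories
find_alias_defs() {
    grep -rnE '^[[:space:]]*alias[[:space:]]' "$@" 2>/dev/null
}

# Common startup locations; missing ones are silently skipped
find_alias_defs /etc/profile /etc/bashrc /etc/profile.d "$HOME/.bashrc" "$HOME/.bash_profile" || true
```

Each hit names the file and line to edit. Overriding an alias in ~/.bashrc is often safer than editing files under /etc, since package updates may replace those.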

Author: Daniel Walker

Can Mutt be used to access Inbox messages in a bash script?

I’ve got scripts that collect data on errors and send it into a ticketing system. If any developments occur regarding an existing ticket, I want to be able to access the email inbox to get the ticket number so that I can include it in the reply. This would make the ticketing system add the new information to the old ticket instead of creating a new ticket every time. Is this possible?

I’ve been looking online for how I might be able to do this but I haven’t yet found anything that looks like bash commands to access the inbox programmatically and save a message’s information to a variable.
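As a rough illustration of saving a message’s information to a variable, here is a sketch that does not use Mutt at all: it reads an mbox file directly with grep and sed. Everything specific in it is an assumption: that the inbox is available as a local mbox file, that the ticketing system puts a marker of the form [Ticket #NNN] in the Subject header, and the helper name ticket_from_mbox.

```shell
#!/bin/bash
# ticket_from_mbox is a hypothetical helper. Assumptions: mail is stored in an
# mbox file and subjects contain a marker such as "[Ticket #1042]".
ticket_from_mbox() {
    grep -E '^Subject:' "$1" | sed -nE 's/.*\[Ticket #([0-9]+)\].*/\1/p' | tail -n 1
}

# Made-up sample message for demonstration
sample=$(mktemp)
printf 'From a@example.com Mon Jan  1 00:00:00 2024\nSubject: [Ticket #1042] Disk error on host1\n\nbody\n' > "$sample"

ticket=$(ticket_from_mbox "$sample")
echo "$ticket"

rm -f "$sample"
```

The helper returns the number from the most recent matching subject, so the variable can be placed into the subject of the reply.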

Author: Tom Cayton

GNU Date: inconsistency calculating number of days between dates

I am trying to calculate the number of days between two dates like this:

$ echo $((($(date +%s -d 2016/11/22)-$(date +%s -d 2016/11/20))/(3600*24))) days
2 days

That is the expected answer and perfectly consistent with this:

$ date -d '2016/11/22 - 2 days'
Sun Nov 20 00:00:00 CET 2016

However, these two seem to be inconsistent:

$ echo $((($(date +%s -d 2020/06/28)-$(date +%s -d 2016/11/20))/(3600*24))) days
1315 days

$ date -d '2020/06/28 - 1315 days'
Mon Nov 21 00:00:00 CET 2016

Am I missing something? Why don’t I get 1316 days (instead of 1315) in the third command I used?

I have done a few more tests changing the month in the date 2020/06/28. It seems that up to March I get the expected answer (i.e., I get Nov 20 in the fourth command), but from April on the inconsistency is present (i.e., I get Nov 21 in the fourth command). Any hints?
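A likely explanation is daylight saving time: the outputs above show the CET zone, and a date after the spring clock change falls in summer time, so the interval is one hour short of 1316 full days and the integer division rounds down. A minimal sketch of a workaround, assuming GNU date, is to parse both dates in UTC:

```shell
#!/bin/bash
# Parse both dates in UTC so that no daylight-saving shift shortens the interval
start=$(date -u -d 2016/11/20 +%s)
end=$(date -u -d 2020/06/28 +%s)

# Every UTC day has exactly 86400 seconds here, so the division is exact
days=$(( (end - start) / 86400 ))
echo "$days days"
```

In UTC both timestamps fall on midnight and differ by a whole number of days, so the division no longer drops a day.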

Author: orelleiro

How to obtain the path of a Bash script, when it’s executed through a symlink?

I want to obtain the path of a Bash script when it’s executed through a symlink.

In this case, $0 is -bash, while BASH_SOURCE is the symlink’s path.

Can I obtain the script’s path through some shell variable, built-in shell command, or external command?

Will BASH_SOURCE always hold the initial symlink’s path, even if the script is executed through several levels of indirection (multiple symlinks)?

Could I use ls with BASH_SOURCE to always retrieve the script’s path?
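One common approach, sketched here under the assumption that GNU coreutils readlink is available, is to pass BASH_SOURCE to readlink -f, which follows every symlink in the chain. The helper name resolve_path is made up for illustration.

```shell
#!/bin/bash
# resolve_path is a hypothetical helper; readlink -f follows every symlink in
# the chain and prints the final absolute path
resolve_path() {
    readlink -f -- "$1"
}

# Inside a script this would typically be called on BASH_SOURCE
# (falling back to $0 when BASH_SOURCE is empty)
script_path=$(resolve_path "${BASH_SOURCE[0]:-$0}")
echo "$script_path"
```

Parsing the output of ls -l would also reveal a link target, but only one level at a time, and it is fragile with unusual file names.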

Author: Shuzheng

Return “continue” from function called from loop

I’m currently refactoring a script which has slowly grown beyond control. I’m trying to spin off repetition into functions. However, I have a repeated test being called from a loop, and want it to return continue.

Shellcheck says

SC2104: In functions, use `return` instead of `continue`.

And the shellcheck wiki says don’t do it. But is there a way?

Below is an example:

#!/bin/sh

AFunction () {
    if [[ "${RED}" -lt 3 ]]; then
        echo "cont"
        continue
    else
        echo "nope"
    fi
}

for i in 1 2 3 4 5
do
    RED=${i}
    AFunction
    echo ${i}
done

This is the output:

cont
1
cont
2
nope
3
nope
4
nope
5

But I would expect

cont
cont
nope
3
nope
4
nope
5
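One way to get the expected output, sketched under the assumption that the script runs under bash (the [[ test is not POSIX sh): let the function report its decision through its return status and leave the continue in the loop itself, which is also what the ShellCheck message points towards.

```shell
#!/bin/bash
# Return 0 when the caller should skip the rest of the iteration, 1 otherwise
AFunction () {
    if [[ "${RED}" -lt 3 ]]; then
        echo "cont"
        return 0
    else
        echo "nope"
        return 1
    fi
}

# Capture the loop output in a variable so it can be inspected afterwards
output=$(
    for i in 1 2 3 4 5
    do
        RED=${i}
        # The loop, not the function, decides to continue
        AFunction && continue
        echo "${i}"
    done
)
printf '%s\n' "$output"
```

The function stays reusable outside loops, and the control flow remains visible at the call site.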

Author: Stripy42

script spawn via atd leaves a process stuck in wait

I have a bash script which I run via at because I want it to run detached.
I run it like this:
echo "bash /path/to/my/script.sh" | at -M now

When the script is spawned, I immediately see 2 instances (in ps), and
even after my script completes successfully, the other instance just
won’t terminate. It gets reparented to pid 1 and just keeps
waiting (strace shows wait4(-1...).

I am not able to figure out why or how this second instance is forked!
When I run the script without at, I don’t see the second process.
Any hints/tips to debug this?

Thanks.

Author: vyom

I added my own path to the $PATH variable in Bash to have global access to some of my scripts, but I cannot run them from any directory

I used ~/.bash_profile to add a new path to my $PATH variable. I added export PATH="/usr/sbin:$PATH" to my ~/.bash_profile, saved it, and ran source ~/.bash_profile in my terminal. When I echo $PATH, I can see the new path in the variable, but I still cannot run the lsof utility (which is in the /usr/sbin directory). The same thing happens with any directory: I add it, it shows up in my $PATH variable, but I cannot run the scripts that are in it.
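A small diagnostic sketch that may help narrow this down. It lists every PATH directory that actually contains an executable with a given name; if nothing is printed, the file is either missing from those directories or lacks execute permission. The helper name find_in_path is made up for illustration.

```shell
#!/bin/bash
# find_in_path is a hypothetical helper: list every PATH directory that holds
# an executable file with the given name
find_in_path() {
    local name=$1 dir
    local IFS=:
    for dir in $PATH; do
        [ -x "$dir/$name" ] && printf '%s\n' "$dir/$name"
    done
    return 0
}

find_in_path lsof

# Discard bash's cache of remembered command locations
hash -r
```

For scripts in a custom directory, the usual suspects are a missing chmod +x on the script or a typo in the directory name. Typographic quotes (” instead of ") around the value in the export line would also end up as literal characters inside PATH.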

Author: Dari V

Copy and paste a folder

I am trying to copy an OpenFoam tutorial folder to my desktop through Ubuntu, since I don’t want to mess up the original files. I use this command:
cp -r $FOAM_TUTORIAL /mnt/c/Users/username/Desktop

but this error pops up
cp: missing destination file operand after ‘/c/Users/username/Desktop’

Can I get any help with this?
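The error text suggests that cp received only one operand, which is what happens when the source variable expands to nothing. A minimal sketch of a guard against that follows. The helper name copy_dir is made up, and the commented call assumes that OpenFOAM’s environment exports the variable as FOAM_TUTORIALS (plural), which is worth checking with echo first.

```shell
#!/bin/bash
# copy_dir is a hypothetical helper: refuse to run cp when the source path
# expanded to an empty string
copy_dir() {
    local src=$1 dest=$2
    if [ -z "$src" ]; then
        echo "source path is empty; is the variable set?" >&2
        return 1
    fi
    cp -r -- "$src" "$dest"
}

# Example call, assuming the OpenFOAM environment defines FOAM_TUTORIALS:
# copy_dir "$FOAM_TUTORIALS" /mnt/c/Users/username/Desktop
```

Quoting the variable matters here: an unquoted empty variable vanishes from the command line entirely, which is how cp ends up with a single operand.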

Author: user421564

SSH Time out Error

I am looking for new ideas on how to do this better.

Home laptop (home Laptop can access only Linux VM)
Linux VM (This machine can access only jump box)
jump box VM

I can ssh fine from the Linux VM to the jump box; keys are set up.

One user, called joe, is set up on the Linux VM, and joe’s .bash_profile looks like this:

ssh 10.0.0.1 || ssh 10.0.0.2

Note: There are two NICs on the jump box; if one is down, we can use the other NIC to log in.

Let’s say the first NIC is down. When I telnet from my home laptop to the Linux VM and provide the user name joe, it should automatically connect me to 10.0.0.1, but since that one is down, it gives a timeout message and takes a long time to connect to the other NIC.

Is there any way, when I telnet in and give the user name, to check the first ssh connection and, if it is down, automatically connect to the second one within 2-3 seconds?

Right now it gives us these messages:

Time out
time out
time out

And then it tries to connect to the second NIC.

I can clarify further if you have any more questions for me.

Author: John

ANSWER

Resolving host names adds to the time a connection attempt takes, so specifying an IP address directly can shave off some of those precious milliseconds. If everything is on a local network with the IP addresses already laid out, that should not be a factor. At least that’s what I think.

Perhaps try explicitly setting the ConnectTimeout option to a shorter value, like so:

ssh -o ConnectTimeout=10 user@host

This can help. Adjust the number to a sweet spot that works for your ssh setup as necessary.
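Applied to the two-address fallback from the question, this might look like the sketch below. The helper name connect_first and the 3-second value are assumptions chosen to match the 2-3 second target. ConnectionAttempts=1 is added so that each address is tried only once.

```shell
#!/bin/bash
# connect_first is a hypothetical helper: try each address in turn, giving up
# on each one after 3 seconds instead of waiting for the default timeout
connect_first() {
    local host
    for host in "$@"; do
        ssh -o ConnectTimeout=3 -o ConnectionAttempts=1 "$host" && return 0
    done
    return 1
}

# In joe's .bash_profile this would replace the original line:
# connect_first 10.0.0.1 10.0.0.2
```

As with the original ssh 10.0.0.1 || ssh 10.0.0.2 line, the next address is also tried when a session on the first one ends with a non-zero exit status.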