Questions tagged [text-processing]
Manipulating or examining text with programs, scripts, etc.
8,524 questions
5 votes, 4 answers, 426 views
How can I find common prefixes in file names to group them?
I would like to be able to find all files in multiple directories whose file names start with the same string, but preferably not if that string is only one word or contains fewer than perhaps 5 ...
3 votes, 2 answers, 180 views
Embedded special characters skewing sed output
The Issue
I've been parsing a file with sed, trying to tweeze out the desired data. This has worked fine for most lines in the file, but there appear to be some embedded special characters that are ...
4 votes, 4 answers, 462 views
Remove new lines and everything after comment symbol with awk or sed
How can I remove comments and newline characters without using two pipes?
I have a bookmarks.txt file with comments.
https://cookies.com # recipes cookbook
https://magicwands.com # shopping
I can copy link ...
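One possible approach to the question above, as a minimal sketch. It assumes a single bookmark line arrives on standard input and that a comment always starts with whitespace followed by `#`; the helper name `strip_comment` is hypothetical and not taken from the question.

```shell
# Hypothetical helper: drop a trailing "# comment" and print the rest
# without a newline. Requiring whitespace before '#' is an assumption that
# keeps URL fragments (for example page#top) intact.
strip_comment() {
    awk '{ sub(/[ \t]+#.*$/, ""); printf "%s", $0 }' "$@"
}
```

For instance, `printf 'https://cookies.com # recipes cookbook\n' | strip_comment` would emit just the URL with no trailing newline, using one awk process instead of two piped commands. If several lines are fed in, they are joined with no separator, which may or may not be what the asker wants.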
5 votes, 5 answers, 355 views
Compare files and combine rows with matching values based on last column
I'm working with several files that come in bundles of four; across groups, the bundles have the same number of columns. See below for an example showing the first four rows with the header:
File1 has ...
2 votes, 1 answer, 102 views
Tmux pane with long-running session using wrong character set?
Today I connected to a long-running process in tmux over ssh for work, to find that the pane the process was running in seems to have started using the wrong character encoding for its output, leading ...
3 votes, 1 answer, 375 views
How to do non-greedy multiline capture with recent versions of pcre2grep?
I noticed a difference in behavior between an older pcre2grep version (10.22) and a more recent one (10.42), and I am wondering how I can get the old behavior back.
Take the following file:
aaa
bbb
...
2 votes, 1 answer, 87 views
Redirect `rtf` output to file
System Info
alinuxchap@libertus-desktop:/usr/share/X11/xkb $ uname -a
Linux libertus-desktop 6.12.25+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.12.25-1+rpt1 (2025-04-30) aarch64 GNU/Linux
alinuxchap@...
6 votes, 7 answers, 1k views
How to find numbers in a textfile that are not divisible by 4096, round them up and write new file?
On a Linux system there is a file numbers.txt.
It contains a few numbers, all separated by spaces.
There are numbers like this: 5476089856 71788143 9999744134 114731731 3179237376
In this example only the ...
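A minimal sketch of one way to do the rounding with awk, assuming the numbers are non-negative integers small enough to be exact in double precision (true for the sample values). The function name `round_up_4096` and the output file name used below are hypothetical.

```shell
# Hypothetical helper: round each space-separated number up to the next
# multiple of 4096; numbers that are already multiples are left unchanged.
# "%.0f" is used so that large values are not printed in exponent notation
# by awk implementations that would otherwise do so.
round_up_4096() {
    awk '{
        for (i = 1; i <= NF; i++) {
            n = $i
            r = n % 4096
            if (r != 0) n += 4096 - r
            printf "%s%.0f", (i > 1 ? " " : ""), n
        }
        print ""
    }' "$@"
}
```

Running `round_up_4096 numbers.txt > rounded.txt` on the sample line should give `5476089856 71790592 9999745024 114733056 3179237376`: the first and last values are already multiples of 4096 and pass through untouched.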
3 votes, 5 answers, 713 views
Randomly pick single line from multiple lines while assigning value to environment variable
In a certain script that we run routinely we configure hostnames in environment variables. Since hostnames can change over time, we try to dynamically pick the current set of hosts using Linux's ...
1 vote, 7 answers, 327 views
Extracting paragraphs with awk
What is the correct way to extract paragraphs from this log file using awk?
$ cat log.txt
par1, line1
par1, line2
par1, line3
par1, line4
par1, line5
par1, last line
par2, line1
...
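A minimal sketch using awk's paragraph mode, assuming the paragraphs in log.txt are separated by blank lines (the excerpt above does not show this, so it is a guess). The helper name `nth_paragraph` is hypothetical.

```shell
# Hypothetical helper: print the n-th paragraph (counting from 1).
# An empty RS puts awk in paragraph mode, where each run of non-blank
# lines separated by one or more blank lines is a single record.
nth_paragraph() {
    n=$1
    shift
    awk -v RS= -v n="$n" 'NR == n' "$@"
}
```

With this, `nth_paragraph 2 log.txt` would print every line of the second paragraph. If the log separates paragraphs some other way, for example by a marker such as "last line", the record separator would need to change accordingly.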
2 votes, 5 answers, 140 views
Formatting git log messages for later processing
I am trying to format and connect git log messages for later processing.
I am using git log --pretty=format:'%H %s' to get commit hash and the complete message at the moment.
I need commit messages to ...
2 votes, 3 answers, 223 views
How to extract specific fields from systemctl output for a custom report
I would like to build a report from the output of certain commands.
For instance, I have the output of this command:
systemctl --type=service --state=running |
grep -e cron -e apache2 -e ...
1 vote, 3 answers, 126 views
Edit all the values in a specific column based on a row number range
I have a PDB file (coordinates of atoms in a protein) on a Linux machine:
ATOM 1 N GLY A 1 0.535 51.766 5.682 1.00 0.00
ATOM 2 CA GLY A 1 -0.712 50....
0 votes, 5 answers, 126 views
Match multiple vars across two lines and delete entire entry
MATCH1.MATCH2 {
always same MATCH3
}
All three MATCH(es) must match.
input:
foo.bar {
always same bus
}
1.2 {
always same 3
}
a.b {
always same c
}
i.ii {
always same iii
}
b.2 {
...
4 votes, 6 answers, 825 views
Remove the first field (and leading spaces) with a single AWK
Consider this input and output:
foo bar baz
bar baz
How do you achieve this with a single AWK command? Please explain your approach too.
These are a couple of tries:
$ awk '{ $1 = ""; print(substr($0, 2)) ...
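One way this could be done in a single awk invocation, offered as a sketch rather than the accepted answer. It assumes fields are separated by spaces or tabs (the default splitting); the helper name `drop_first_field` is hypothetical.

```shell
# Hypothetical helper: delete optional leading blanks, the first field,
# and the blanks that follow it. Editing $0 with sub() leaves the rest of
# the line untouched, so spacing between the remaining fields is kept.
drop_first_field() {
    awk '{ sub(/^[ \t]*[^ \t]+[ \t]*/, ""); print }' "$@"
}
```

For the input `foo bar baz` this prints `bar baz`. Unlike assigning to `$1`, which makes awk rebuild the record with single-space separators, the `sub()` call only removes the matched prefix.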