Relational column processing using awk, sed

File 1 ---> Scrambled data in column (old stale files with intermediate test data)

mL9A7hajHyuVIQr1HNP7ThYfj9yBUd
Iq4iqnH4UftLgGUSobLeti0hkmdMn7
BlzanDNcIsgru2wNYlO6kDjpuPvs82
eqOZRXfdcxHqd26Raqd6ZOtPhoQp33
CrWSI2eyZZkkYlEbOoHgu2o43tU3xa
IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7
YuuRhD5f3xju0RnUCjS66g3X2TNNIj
MpJHtG8FjeErwsh6emcCu7B4bHwCnR
aQ1DXaG7XSopgxBeBsRRZcCh2xRu5f
nXgwlL0p8LEWNFGznIy2NUXBWHzZgS

File 2 ---> Same data but in final order (new files after the yearly audit, with the intermediate data changed to its final form after testing)

BlzanDNcIsgru2wNYlO6kDjpuPvs82
CrWSI2eyZZkkYlEbOoHgu2o43tU3xa
IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7
Iq4iqnH4UftLgGUSobLeti0hkmdMn7
MpJHtG8FjeErwsh6emcCu7B4bHwCnR
YuuRhD5f3xju0RnUCjS66g3X2TNNIj
aQ1DXaG7XSopgxBeBsRRZcCh2xRu5f
eqOZRXfdcxHqd26Raqd6ZOtPhoQp33
mL9A7hajHyuVIQr1HNP7ThYfj9yBUd
nXgwlL0p8LEWNFGznIy2NUXBWHzZgS

Trying to explain the problem: from now on, whatever processing we do on the Final_results file should be reflected in the intermediate test files, preserving their order.

paste <(cat temp_data | nl) <(cat Final_results) | column

1   CrWSI2eyZZkkYlEbOoHgu2o43tU3xa         3   BlzanDNcIsgru2wNYlO6kDjpuPvs82
2   BlzanDNcIsgru2wNYlO6kDjpuPvs82         5   CrWSI2eyZZkkYlEbOoHgu2o43tU3xa
3   IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7         6   IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7
4   Iq4iqnH4UftLgGUSobLeti0hkmdMn7         2   Iq4iqnH4UftLgGUSobLeti0hkmdMn7
5   aQ1DXaG7XSopgxBeBsRRZcCh2xRu5f         8   MpJHtG8FjeErwsh6emcCu7B4bHwCnR
6   YuuRhD5f3xju0RnUCjS66g3X2TNNIj         7   YuuRhD5f3xju0RnUCjS66g3X2TNNIj
7   MpJHtG8FjeErwsh6emcCu7B4bHwCnR         9   aQ1DXaG7XSopgxBeBsRRZcCh2xRu5f
8   eqOZRXfdcxHqd26Raqd6ZOtPhoQp33         4   eqOZRXfdcxHqd26Raqd6ZOtPhoQp33
9   mL9A7hajHyuVIQr1HNP7ThYfj9yBUd         1   mL9A7hajHyuVIQr1HNP7ThYfj9yBUd
10  nXgwlL0p8LEWNFGznIy2NUXBWHzZgS         10  nXgwlL0p8LEWNFGznIy2NUXBWHzZgS
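
For reference, the number shown in front of each Final_results string above appears to be that string's line number in the old file. A minimal sketch of how such an index column could be produced with plain awk follows. It assumes every string occurs exactly once in both files; the three short values are illustrative stand-ins, not the real data, and the scratch directory is only there to keep the example self-contained.

```shell
# Scratch directory so the sample files do not clobber anything real.
dir=$(mktemp -d)

# Three-line stand-ins for the two files (shortened, illustrative values).
printf '%s\n' mL9A Iq4i Blza > "$dir/temp_data"       # old, scrambled order
printf '%s\n' Blza Iq4i mL9A > "$dir/Final_results"   # new, final order

# First file (NR==FNR): remember the line number of every string.
# Second file: prefix each line with its line number in the first file.
out=$(awk 'NR==FNR { pos[$0] = FNR; next } { print pos[$0], $0 }' \
      "$dir/temp_data" "$dir/Final_results")
printf '%s\n' "$out"

rm -rf "$dir"
```

The `NR==FNR` test is true only while awk is reading the first file named on the command line, which is what allows the lookup table to be built before the second file is processed.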

Desired relational processing: if I change `3 BlzanDNcIsgru2wNYlO6kDjpuPvs82` to `BlzanDNcIsgru2wNYlO6kDjpuPvs82--SUFFIX` in File 2 (the right-hand column pair in the output above), then the same change should be reflected on line 3 of File 1 (the left-hand column pair), giving `IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7-SUFFIX`.

E.g. what was IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7 before testing is now BlzanDNcIsgru2wNYlO6kDjpuPvs82 in Final_results. So whatever processing we do on BlzanDNcIsgru2wNYlO6kDjpuPvs82 in Final_results should be reflected on IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7 in the previous files.

PROBLEM1 Desired Output 1

1   CrWSI2eyZZkkYlEbOoHgu2o43tU3xa         3   BlzanDNcIsgru2wNYlO6kDjpuPvs82--SUFFIX
2   BlzanDNcIsgru2wNYlO6kDjpuPvs82         5   CrWSI2eyZZkkYlEbOoHgu2o43tU3xa
3   IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7-SUFFIX   6   IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7
4   Iq4iqnH4UftLgGUSobLeti0hkmdMn7         2   Iq4iqnH4UftLgGUSobLeti0hkmdMn7
5   aQ1DXaG7XSopgxBeBsRRZcCh2xRu5f         8   MpJHtG8FjeErwsh6emcCu7B4bHwCnR
6   YuuRhD5f3xju0RnUCjS66g3X2TNNIj         7   YuuRhD5f3xju0RnUCjS66g3X2TNNIj
7   MpJHtG8FjeErwsh6emcCu7B4bHwCnR         9   aQ1DXaG7XSopgxBeBsRRZcCh2xRu5f
8   eqOZRXfdcxHqd26Raqd6ZOtPhoQp33         4   eqOZRXfdcxHqd26Raqd6ZOtPhoQp33
9   mL9A7hajHyuVIQr1HNP7ThYfj9yBUd         1   mL9A7hajHyuVIQr1HNP7ThYfj9yBUd
10  nXgwlL0p8LEWNFGznIy2NUXBWHzZgS         10  nXgwlL0p8LEWNFGznIy2NUXBWHzZgS

PROBLEM2 Desired Output 2 Transpose second file to first.

See the paste command output: `3 BlzanDNcIsgru2wNYlO6kDjpuPvs82--SUFFIX` has 3 in the third column. So the 3rd row of the first file's column needs to be replaced with the first row of column 4, the one that has 3 in the third column. Likewise, transpose the columns for the entire output and finally print the changed data.

1 mL9A7hajHyuVIQr1HNP7ThYfj9yBUd-Suffix
2 Iq4iqnH4UftLgGUSobLeti0hkmdMn7-Suffix
3 BlzanDNcIsgru2wNYlO6kDjpuPvs82-Suffix
4 eqOZRXfdcxHqd26Raqd6ZOtPhoQp33-Suffix
5 CrWSI2eyZZkkYlEbOoHgu2o43tU3xa-Suffix
6 IUnZ8VPgw0FuqJmsY6FYfUpMdDNnk7-Suffix
7 YuuRhD5f3xju0RnUCjS66g3X2TNNIj-Suffix
8 MpJHtG8FjeErwsh6emcCu7B4bHwCnR-Suffix
9 aQ1DXaG7XSopgxBeBsRRZcCh2xRu5f-Suffix
10 nXgwlL0p8LEWNFGznIy2NUXBWHzZgS-Suffix

##Simplified data (for testing solutions)

1  Mina_Warren     2  Ayden_Silva
2  Jazlene_Gibbs   4  Quintin_Glover
3  Kaleigh_Farley  1  Callum_Mckay
4  Callum_Mckay    7  Jazlene_Gibbs
5  Finn_Nelson     6  Mina_Warren
6  Ayden_Silva     3  Kaleigh_Farley
7  Quintin_Glover  5  Finn_Nelson

Output (ONLY FOR PROBLEM 1)

1  Mina_Warren--Suffix3     2  Ayden_Silva--Suffix1
2  Jazlene_Gibbs--Suffix1   4  Quintin_Glover--Suffix2
3  Kaleigh_Farley--Suffix6  1  Callum_Mckay--Suffix3
4  Callum_Mckay--Suffix2    7  Jazlene_Gibbs--Suffix4
5  Finn_Nelson--Suffix7     6  Mina_Warren--Suffix5
6  Ayden_Silva--Suffix5     3  Kaleigh_Farley--Suffix6
7  Quintin_Glover--Suffix4  5  Finn_Nelson--Suffix7

(ONLY FOR PROBLEM 1) Here Suffix means any kind of processing: editing, renaming, replacing, etc. Any processing done on the 4th column of a given line must be applied in the same way to the 2nd column of the nth line, where n is taken from the third column.

Logic for Problem 1 in awk:
-- Read the first record.
-- Store column $4 in a variable.
-- Go to the line number shown in column $3.
-- Replace column $2 there with the variable stored from $4.
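
The steps above can be sketched in plain awk by storing every row in arrays first and applying the edits in an END block, since awk cannot jump back to an earlier line. This is only an illustration under two assumptions: column 1 equals the input line number, and "processing" means appending `--Suffix` plus the number of the row whose column 4 triggered the edit.

```shell
data='1 Mina_Warren 2 Ayden_Silva
2 Jazlene_Gibbs 4 Quintin_Glover
3 Kaleigh_Farley 1 Callum_Mckay
4 Callum_Mckay 7 Jazlene_Gibbs
5 Finn_Nelson 6 Mina_Warren
6 Ayden_Silva 3 Kaleigh_Farley
7 Quintin_Glover 5 Finn_Nelson'

out=$(printf '%s\n' "$data" | awk '
{
    left[NR]  = $2    # column 2 of this row
    idx[NR]   = $3    # row number that column 4 points at
    right[NR] = $4    # column 4 of this row
}
END {
    for (r = 1; r <= NR; r++) {
        tag = "--Suffix" r                  # placeholder for the real processing
        right[r]     = right[r] tag         # edit column 4 of row r
        left[idx[r]] = left[idx[r]] tag     # same edit on column 2 of the target row
    }
    for (r = 1; r <= NR; r++)
        print r, left[r], idx[r], right[r]
}')
printf '%s\n' "$out"
```

Keeping the edit in one `tag` variable makes it easy to swap the suffix for any other transformation, as long as it is applied to both `right[r]` and `left[idx[r]]`.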

Consider

1  Mina_Warren     2  Ayden_Silva
2  Jazlene_Gibbs   4  Quintin_Glover

The third column holds the line number for the relational edit. Logic for Problem 1: read row 1 ---> store Ayden_Silva in a variable ---> go to row 2, because $3 of row 1 is 2 ---> on row 2, apply the same processing to Jazlene_Gibbs.

Desired Output Problem 1

1  Mina_Warren              2  Ayden_Silva--Suffix1
2  Jazlene_Gibbs--Suffix1   4  Quintin_Glover

Logic for Problem 2 (transposing): read row 1 ---> store Ayden_Silva in a variable ---> go to row 2, because $3 of row 1 is 2 ---> on row 2, replace Jazlene_Gibbs with the processed version of Ayden_Silva ---> do this for all lines in a loop ---> delete column 3 and column 4.

Desired output Problem 2

1  Callum_Mckay--Suffix3     
2  Ayden_Silva--Suffix1   
3  Kaleigh_Farley--Suffix6  
4  Quintin_Glover--Suffix2    
5  Finn_Nelson--Suffix7     
6  Mina_Warren--Suffix5     
7  Jazlene_Gibbs--Suffix4  
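
A minimal plain-awk sketch of this transposition: each row's processed column 4 is filed under the row number given in column 3, and the result is printed in row order at the end. As before, the suffix is only a stand-in for the real processing, and the sketch assumes column 3 is a permutation of the line numbers 1..N.

```shell
data='1 Mina_Warren 2 Ayden_Silva
2 Jazlene_Gibbs 4 Quintin_Glover
3 Kaleigh_Farley 1 Callum_Mckay
4 Callum_Mckay 7 Jazlene_Gibbs
5 Finn_Nelson 6 Mina_Warren
6 Ayden_Silva 3 Kaleigh_Farley
7 Quintin_Glover 5 Finn_Nelson'

out=$(printf '%s\n' "$data" | awk '
# File the processed column 4 under the target row number from column 3.
{ moved[$3] = $4 "--Suffix" NR }

# Print the relocated values in row order; columns 3 and 4 are dropped.
END {
    for (n = 1; n <= NR; n++)
        print n, moved[n]
}')
printf '%s\n' "$out"
```

Because the old column 2 is overwritten entirely, it never needs to be stored; only the index in column 3 and the value in column 4 matter.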

Tried

in="$(awk 'END { print NR }' file)"
awk -v ty=$in '{for (i=1;i<=ty;i++) NR==$i; var=$4; varb=$3; NR == varb; $2=var; print}' file

but it does not work as intended. The logic I tried: in = total number of records (here 7), then a for loop runs 7 times ---> i=1; NR==1 ## go to record 1 ---> store $4 in var and $3 in varb ---> NR==varb ## go to the record specified in varb, e.g. 2, then replace $2 with var. Loop ---> i=2, go to record 2, and likewise.
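
One likely reason the attempt behaves this way: inside an awk action, an expression such as `NR == varb` is only a comparison whose result is discarded. It does not move to another record, because awk reads its input strictly in order and runs the whole action once per line. A small sketch of the usual workaround, buffering the lines in arrays and doing the "go to record n" lookups in END, using made-up three-line input:

```shell
# Each line: a name and the number of the line it points at (illustrative data).
out=$(printf '%s\n' 'a 3' 'b 1' 'c 2' | awk '
{
    name[NR]   = $1    # remember every record by its line number
    target[NR] = $2
}
END {
    # Random access is now an array lookup instead of a jump.
    for (r = 1; r <= NR; r++)
        print name[r], "->", name[target[r]]
}')
printf '%s\n' "$out"
```

The same store-then-look-up pattern also removes the need to count the records in a separate awk call beforehand, since NR still holds the total inside END.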

  • Is --suffix always separated by dashes, or is the scrambled data string always of the same length (30)? This is important for splitting the string so that it can be compared between left and right and the suffix applied to the left correctly. Commented Jan 22, 2022 at 10:04
  • @Tathastu "any kind of processing" is too unspecific. How is "processing" visible, how does one know if it was processed or not? Does changing the first letter count as processing? If so, how does one know? Or is it really just all at the end of the string, and the original string always has the same length? Please be on-point or solutions will not work; e.g. people trusting that the dash exists, but you say it might not be there. Commented Jan 22, 2022 at 10:29
  • Re: multiple accounts: unix.stackexchange.com/users/508416/tathastu-pandya unix.stackexchange.com/users/509083/tathastu-pandya unix.stackexchange.com/users/509185/tathastu-pandya unix.stackexchange.com/users/414574/tathastu-pandya unix.stackexchange.com/users/510830/kung and more? Please talk to the moderators about how to merge all of your accounts into 1 so you can develop a reputation, gain privileges, edit your own questions, etc. Commented Jan 22, 2022 at 13:45
  • @EdMorton Thank you for helping the OP get things straightened out! There's actually a self-service form for merging accounts. It's at the bottom of each page under Contact -- "What can we help you with?" = "I want to merge user profiles". It seems like the OP is having trouble registering accounts because they're clearing their browser cookies. It's been a while since I've created an account, but I believe you can register via Google, Facebook, or a username & password. You can then log back in the same way. Hope this helps! Commented Jan 22, 2022 at 16:18
  • @Tathastu there's no place for rude or abusive language here on Stack Exchange. I've deleted your previous comment. Please abide by the Code of Conduct while you're here, guest or not. Thank you. Commented Feb 3, 2022 at 15:09

1 Answer

Using cppawk:

This is a new kid on the blawk, developed just over this past month.

cppawk is preprocessed with the C preprocessor, producing input for regular Awk, and comes with some library headers for Lisp-like list processing, fancy iteration, and other utilities.

Problem 1:

#include <cons.h>

BEGIN {
  bag = list_begin()
}

{
  left[$1] = $2
  right[$3] = $4

  leftn[$3] = $1

  bag = list_add(bag, $3)
}

END {
  finlist = list_end(bag)

  dolist (i, finlist)
  {
    left[i] = left[i] "--Suffix" ++suff
    right[i] = right[i] "--Suffix" suff
  }

  dolist (i, finlist)
  {
     print leftn[i], left[leftn[i]], i, right[i]
  }
}

Output:

cppawk -f prob-1.cwk prob-1-data 
1 Mina_Warren--Suffix3 2 Ayden_Silva--Suffix1
2 Jazlene_Gibbs--Suffix1 4 Quintin_Glover--Suffix2
3 Kaleigh_Farley--Suffix6 1 Callum_Mckay--Suffix3
4 Callum_Mckay--Suffix2 7 Jazlene_Gibbs--Suffix4
5 Finn_Nelson--Suffix7 6 Mina_Warren--Suffix5
6 Ayden_Silva--Suffix5 3 Kaleigh_Farley--Suffix6
7 Quintin_Glover--Suffix4 5 Finn_Nelson--Suffix7

Problem 2:

#include <cons.h>

BEGIN {
  bag = list_begin()
}

{
  left[$1] = $2
  right[$3] = $4

  leftn[$3] = $1

  bag = list_add(bag, $3)
}

END {
  finlist = list_end(bag)

  dolist (i, finlist)
  {
    left[i] = right[i] "--Suffix" ++suff
  }

  dolist (i, finlist)
  {
     print leftn[i], left[leftn[i]]
  }
}

Output:

cppawk -f prob-2.cwk prob-1-data 
1 Callum_Mckay--Suffix3
2 Ayden_Silva--Suffix1
3 Kaleigh_Farley--Suffix6
4 Quintin_Glover--Suffix2
5 Finn_Nelson--Suffix7
6 Mina_Warren--Suffix5
7 Jazlene_Gibbs--Suffix4
