Using Power Query, we can split a column on the transition from non-digit to digit, which works great for a value such as Lead 10 (it splits into Lead and 10). However, is there any way to split in the same way if the number is a decimal, e.g. Lead 20.5? Splitting on non-digit to digit splits this into Lead 20. 5

I have the following example data I wish to split as follows:

Lead 20.5 --> `Lead` `20.5`
No Data --> `null`
Arsenic 10 --> `Arsenic` `10`
Gold 50.55 --> `Gold` `50.55`
1,4-Dioxane 21 --> `1,4-Dioxane` `21`

Previously I split on the rightmost " "; however, this splits No Data into separate words.

Any ideas on how to achieve this would be great.

Update 1: Issue with 1,4-Dioxane

M Code:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
        #"Split Column by Character Transition" = Table.SplitColumn(#"Changed Type", "Column1",
            Splitter.SplitTextByCharacterTransition((c) => not List.Contains({"0".."9","."}, c), {"0".."9","."}), {"Column1.1", "Column1.2"})
    in
        #"Split Column by Character Transition"
  • Have you tried left() and right() with find() using the space in find? Commented Jun 24, 2021 at 11:39

2 Answers

How to do this depends on your data.

Edit
to account for additional data sample with digits in chemical name

Algorithm

  • Test the last word
    • if last word is NOT a number, then replace spaces with NBSP
    • Then split on the rightmost space.

I will use a custom function to check the last word and modify the string if the last word is not a number.

Custom Function M Code:
enter as a blank query and rename it: fnConvString
Edited to improve computation

//Rename this query "fnConvString"
(string as text) =>
let 
   lastWord = Text.AfterDelimiter(string," ",{0,RelativePosition.FromEnd}),
   lastIsNumber = try Value.Type(Number.FromText(lastWord)) = type number otherwise false,
   replSpace = if lastIsNumber = false then Text.Replace(string," ",Character.FromNumber(160)) else string
in 
   replSpace
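
As a quick sanity check, the function can be run against the sample values from the question. This is only a sketch: it assumes the function query is named fnConvString as described above, and the names samples, check and result are illustrative. The NBSP is swapped for "~" in the output purely so that it is visible; only No Data should show a changed value.

```
// Sketch: run fnConvString over the question's sample values.
// Assumes a query named fnConvString exists in the same workbook.
let
    samples = {"Lead 20.5", "No Data", "Arsenic 10", "Gold 50.55", "1,4-Dioxane 21"},
    // Pair each input with its converted form; NBSP (character 160) is shown as "~"
    check = List.Transform(samples, each [
        Input = _,
        Output = Text.Replace(fnConvString(_), Character.FromNumber(160), "~")
    ]),
    result = Table.FromRecords(check)
in
    result
```

Because the spaces in No Data become NBSPs, the later split on the rightmost ordinary space leaves that row in one piece.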

Main M Code
Edited to simplify code with no added columns

let
    Source = Excel.CurrentWorkbook(){[Name="Table29"]}[Content],
    addNBSP = Table.TransformColumns(Source,{"Column1", each fnConvString(_)}),
    #"Split Column by Delimiter" = Table.SplitColumn(addNBSP, "Column1", 
        Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, true), {"Column1.1", "Column1.2"}),
    #"Changed Type" = Table.TransformColumnTypes(
        #"Split Column by Delimiter",{{"Column1.1", type text}, {"Column1.2", type number}})
in
    #"Changed Type"


Edit without custom function

If you would prefer to not use a custom function, you can incorporate that within the main code as a Transform Operation:

M Code without custom function

let
    Source = Excel.CurrentWorkbook(){[Name="Table29"]}[Content],
    
    addNBSP = Table.TransformColumns(Source,{"Column1", each 
        let 
            lastWord = Text.AfterDelimiter(_," ",{0,RelativePosition.FromEnd}),
            lastIsNumber = try Value.Type(Number.FromText(lastWord)) = type number otherwise false,
            replSpace = if lastIsNumber = false then Text.Replace(_," ",Character.FromNumber(160)) else _
        in 
            replSpace
    }),

    #"Split Column by Delimiter" = Table.SplitColumn(addNBSP, "Column1", 
        Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, true), {"Column1.1", "Column1.2"}),
    #"Changed Type" = Table.TransformColumnTypes(
        #"Split Column by Delimiter",{{"Column1.1", type text}, {"Column1.2", type number}})
in
    #"Changed Type"

7 Comments

Hi there, this is fine since there shouldn't be a lone "."; however, I have noticed that it's also splitting if a material has a comma. Any thoughts?
@nick Please show a data example. I cannot reproduce from what you write.
@Nick Maybe you inadvertently included a comma within the quote marks when you edited your M Code?
Please see the update. I copied the code exactly, and yet 1,4-Dioxane 21 splits into 1, 4-Dioxane
@Nick The problem is not that the material has a comma; the problem is that the material has digits. Let me look at that more closely.

In Power Query, based on the sample data, it looks like you could just split on the space character.

Right-click the column .. Split Column .. By Delimiter ... delimiter: Space ... Split at: Left-most delimiter

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1",
        Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Column1.1", "Column1.2"})
in
    #"Split Column by Delimiter"
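
To see what the left-most split does row by row, the splitter function can be applied directly to the sample values. This is a sketch only; the names splitFirstSpace, samples and result are illustrative and not part of the query above.

```
// Sketch: apply the left-most-space splitter to each sample value.
let
    // Same splitter as in the query above: one split, starting from the left
    splitFirstSpace = Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false),
    samples = {"Lead 20.5", "No Data", "Arsenic 10", "Gold 50.55", "1,4-Dioxane 21"},
    // Each element of the result is a list holding the pieces of one value
    result = List.Transform(samples, splitFirstSpace)
in
    result
```

Each value with a single space splits cleanly at that space. Note that No Data also splits at its space, into No and Data, so that row would still need separate handling if it must stay whole.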

If your data doesn't suit that method, you could separate the numeric characters from the rest.

Add a custom column with the formula

= Text.Remove([Column1],{"0".."9","."})

to get the text only portion, and adding a second custom column with formula

=try Text.Remove([Column1],Text.ToList(Text.Remove([Column1],{"0".."9","."}))) otherwise null

to get the numerical portion

Sample full code

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Added Custom" = Table.AddColumn(Source, "Text", each Text.Remove([Column1],{"0".."9","."})),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Numeric",
        each try Text.Remove([Column1],Text.ToList(Text.Remove([Column1],{"0".."9","."}))) otherwise null)
in
    #"Added Custom1"
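
One caveat with this approach is that it removes every digit and period in the value, wherever they occur, not just the trailing number. The sketch below applies the same two formulas to a single value whose name contains digits; the names s, digitsAndDot, textPart and numericPart are illustrative.

```
// Sketch: the two Text.Remove formulas applied to one value with digits in its name.
let
    s = "1,4-Dioxane 21",
    digitsAndDot = {"0".."9", "."},
    // Strips all digits and periods, including those inside the name
    textPart = Text.Remove(s, digitsAndDot),
    // Strips every character that survived above, leaving only digits and periods
    numericPart = Text.Remove(s, Text.ToList(textPart))
in
    [Text = textPart, Numeric = numericPart]
```

Here the digits from the name end up in the numeric part, so the method is only safe when names never contain digits or periods.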

3 Comments

This is good, but I have issues when splitting data such as 1,4-Dioxane 21. It splits into ,-Dioxane and 1421.
I appreciate I didn't include this in the example data. I have added it now to improve the question.
Splitting on a space still looks like it would have worked fine
