310 questions
Best practices
0
votes
0
replies
68
views
Azure Data Factory vs Microsoft Fabric
I have recently started a new job as a data engineer. In the current setup, downstream data is pushed to Blob Storage in Azure and ingested with ADF into an Azure SQL DB. The ...
0
votes
1
answer
77
views
Data transformation in Power BI for a column where the data source is Azure Synapse Analytics SQL
I loaded my data into Power BI by connecting to Azure Synapse. I need to transform a text column in which each row contains a list of numeric values, e.g. [0,1.0,0.05], which needs to ...
0
votes
1
answer
120
views
Formatting a CSV file in PySpark
I have a pipe-delimited (|) CSV file with data as shown below.
AccountID|BounceSubcategory|BounceTypeID|BounceType|SMTPBounceReason|SMTPMessage|SMTPCode|TriggererSendDefinitionObjectID|...
2
votes
1
answer
75
views
How to fit scaler for different subsets of rows depending on group variable and include it in a Pipeline?
I have a data set like the following and want to scale the data using any of the scalers in sklearn.preprocessing.
Is there an easy way to fit this scaler not over the whole data set, but per group? ...
2
votes
5
answers
113
views
Dropping grouped rows based on a certain hierarchical column
Suppose I have this pandas dataset:
| ID | Question Code |
|----|---------------|
| 1  | Q01           |
| 1  | Q01-1         |
| 1  | Q02           |
| 2  | Q01           |
| 2  | Q02           |
| 2  | Q02-1         |
| 2  | Q02-1-1       |
| 2  | Q02-2         |
I want to remove the rows based on certain hierarchical conditions between the values of ...
-1
votes
1
answer
77
views
How to Transform a Soccer Match DataFrame to a Long Format with Separate Rows for Home and Away Teams in R [duplicate]
I have a DataFrame in R with the following columns:
season: The season of the match (e.g., "2015/2016")
stage: The stage or round of the match (e.g., 1 for Round 1)
home_team_api_id: The ID ...
-1
votes
1
answer
63
views
Data-transformation with R and dplyr [closed]
I have data in the following format:
| Location  | Species  | Date       | Count |
|-----------|----------|------------|-------|
| Location1 | Species1 | 01-01-2024 | 2     |
| Location1 | Species1 | 01-02-2024 | 4     |
| Location1 | Species1 | 01-03-2024 | 3     |
| Location1 | Species2 | 01-01-2024 | 6     |
...
0
votes
2
answers
2k
views
How to Reference Previous Row Value in Power Query for Custom Column Logic?
I'm working in Power Query and trying to create a custom column that mimics the following Excel formula, but I'm struggling with the M code:
=IF(A2 > 0, IF(A2 = A1, 0, A2), A2)
The goal is:
If the current ...
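The row-by-row logic of the Excel formula above can be sketched in Python as follows. This is only an illustration of the rule, not M code and not the accepted answer; the function name `apply_previous_row_rule` and the sample values are hypothetical, and the first row is assumed to have no previous row to compare against.

```python
def apply_previous_row_rule(values):
    """Mimic =IF(A2 > 0, IF(A2 = A1, 0, A2), A2) applied down a column."""
    result = []
    previous = None  # assumption: the first row has no previous row
    for current in values:
        if current > 0 and current == previous:
            # Positive value repeated from the row above: replace with 0.
            result.append(0)
        else:
            # Non-positive values, and values that changed, pass through.
            result.append(current)
        # Compare against the original column, as the formula does with A1.
        previous = current
    return result


sample = [5, 5, 3, 0, 0, 3, 3]  # made-up data for illustration
print(apply_previous_row_rule(sample))  # [5, 0, 3, 0, 0, 3, 0]
```

Note that the comparison uses the original previous value, not the newly computed one, which matches how the formula references column A.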
0
votes
1
answer
56
views
Error in M language while Calculating Fiscal Quarter
Errors: Expected to find a right parenthesis <')'>, but a keyword <'then'> was found instead
Power Query Editor
I'm trying to find a solution for inserting a new column in Power Query ...
1
vote
1
answer
121
views
Excel: how to merge duplicate rows into a single row with additional columns?
I need help formatting my data as shown in the image below. These are only 3 columns; I have many more. How can I format this?
Current State
Desired State
For context, my dataset has 2100 ...
-1
votes
1
answer
330
views
How to Separate Data with Inconsistent Patterns into a Structured Format in Excel
Inconsistent Values in Cells
I'm working with a dataset where multiple values in a cell are tagged under categories like Location, Host, Guest, and Bucket, and separated by line breaks. I need to ...
0
votes
1
answer
115
views
Change x-axis scale to cuberoot without transforming raw data using trans_new() [closed]
I need to change the x-axis of my ggplot figure to a cube-root scale without transforming the raw data. My code below had been working, but since the latest R update I am getting the error:
Error in if (...
0
votes
0
answers
80
views
Power BI - DateTime value loses 1 second when changing the field type from DateTime to Text
I have a DateTime value called "Hour". It represents each hour (rounded) of the day.
Due to other transformations I want to apply, I need to change the field's type to Text.
However, when I ...
0
votes
2
answers
48
views
How do I create a new column in my DF of daily measurements that gives me the increase between today's and yesterday's measurement?
I have a column of datapoints for daily measurements in my DF. I would like to add a new column to said DF that tells me the increase or decrease of this measurement in comparison to yesterday's.
...
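A minimal sketch of the day-over-day change, assuming "DF" means a pandas DataFrame with one row per day. The column names `date`, `measurement`, and `change`, and the sample values, are hypothetical.

```python
import pandas as pd

# Made-up daily measurements for illustration.
df = pd.DataFrame(
    {
        "date": pd.date_range("2024-01-01", periods=4, freq="D"),
        "measurement": [10.0, 12.5, 11.0, 11.0],
    }
)

# Make sure rows are in chronological order before differencing.
df = df.sort_values("date").reset_index(drop=True)

# diff() subtracts the previous row from the current one;
# the first row has no predecessor, so its change is NaN.
df["change"] = df["measurement"].diff()

print(df)
```

If some days are missing, `diff()` still compares against the previous row rather than the previous calendar day, so the data may need reindexing to a full daily range first.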
0
votes
1
answer
50
views
Trying to convert time format 0:00:00 to seconds (integer)
Trying to convert time format 0:00:00 to seconds (integer), for use in a Derived Column in SSIS. I've tried (DT_I4)TOKEN(column,":",1) * 60 + (DT_I4)TOKEN(column,":",2), but it is ...
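The arithmetic behind the conversion can be sketched in Python, assuming the value is always in h:mm:ss form with three colon-separated parts. This is not an SSIS expression, and the function name `time_to_seconds` is hypothetical.

```python
def time_to_seconds(text):
    """Convert an h:mm:ss string to a whole number of seconds."""
    # Assumption: exactly three colon-separated integer parts.
    hours, minutes, seconds = (int(part) for part in text.strip().split(":"))
    return hours * 3600 + minutes * 60 + seconds


print(time_to_seconds("0:00:00"))  # 0
print(time_to_seconds("1:02:03"))  # 3723
```

The same weighting (first token times 3600, second times 60, third as is) would apply to a three-token expression in any tool.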