7,948 questions
1
vote
1
answer
23
views
DataprocSparkSession package in python error - "RuntimeError: Error while creating Dataproc Session"
I am using the code below to create a Dataproc Spark Session to run a job:
from google.cloud.dataproc_spark_connect import DataprocSparkSession
from google.cloud.dataproc_v1 import Session
session = Session(...
0
votes
0
answers
64
views
How to merge small parquet files in Hudi into larger files
I use Spark + Hudi to write data into S3. I was writing data in bulk_insert mode, which caused many small parquet files to accumulate in the Hudi table.
Then I tried to schedule clustering on the Hudi table:
...
0
votes
0
answers
17
views
Apache Spark prerequisites
The problem I am facing is installing the corresponding dependencies for Spark on Ubuntu. For example, Apache Spark 3.3.x needs a different version of Python, an older version of Java, etc. The same goes for GraphFrames, so I ...
1
vote
0
answers
31
views
Spark DSv2 Options vs Properties
I'm playing around with making a DSv2 data source, and I'm a bit confused about what the differences between the "options" and "properties" args passed to some of the TableProvider ...
0
votes
1
answer
40
views
Power BI connection to DorisDB fails with "Character set 'utf8mb3' is not supported by .NET Framework"
I am trying to connect Power BI Desktop to our Apache Doris database (which is the VeloDB-Doris distribution). I am using the standard MySQL data source connector in Power BI, as Doris is compatible ...
1
vote
1
answer
77
views
How to break down a column which contains several different features, so that a new column is built for each feature
I want to break down a column which contains several different features, so that a new column is built for each feature, using the feature name as the column name. I already tried with:
data = {'...
0
votes
1
answer
71
views
Geowave or S2 index for squares and rectangles
Geowave, Geomesa and S2 Geometry offer a Hilbert index that seems suitable for a quadrilateral grid, with a unique 64-bit cell_ID per cell, for all grid levels...
However, I don't see how to use ...
0
votes
1
answer
37
views
How can I immediately reclaim disk space after dropping a table (or quickly purge its tablets)?
I’m running an Apache Doris 2.1.7 cluster (3 FEs + 6 BEs) on CentOS 7.
After issuing DROP TABLE big_fact, the table disappears from the information_schema, but the underlying tablets remain on every ...
0
votes
0
answers
12
views
"Error starting FE or unit test locally Cannot find external parser table action_table.dat"
I encountered an error while setting up and using Doris during unit testing:
Error starting FE or unit test locally Cannot find external parser table action_table.dat
I searched the community and ...
1
vote
1
answer
138
views
Apache Doris FE Cluster: "Clock delta: xxxx ms between Feeder: xxxx and this Replica exceeds max permitted delta: xxxx ms" causes BDB
I encountered an issue while running an Apache Doris FE cluster, where the fe.log file shows the following error:
2024-01-09 14:46:23,840 WARN (UNKNOWN fe_f78cf069_b094_4d9d_ac9c_ddc521dd494d(-1)|1) [...
0
votes
0
answers
37
views
Apache Doris query fails with error: [E-230]missed_versions is empty - How to diagnose and fix?
We are intermittently encountering a query failure on our Apache Doris cluster. The query fails completely with the following error message:
Query error: [E-230]missed_versions is empty
This error ...
0
votes
0
answers
46
views
Query error: "Failed to get scan range, no queryable replica found in tablet: xxxx"
During the process of setting up and using Doris, I encountered a query error:
Failed to get scan range, no queryable replica found in tablet: xxxx
This error seems to be a scanning error for the ...
0
votes
0
answers
23
views
Error: "stream load Times: Reason: Unable to display. src line [Unable to display]"
When I load data from a local file into a Doris table, I get the error:
Reason: Unable to display. src line [Unable to display]
What happened?
My version of Apache Doris is 2.1.5, and the steps to ...
0
votes
1
answer
31
views
Query Times: process memory used 48.26 GB exceed limit 50.21 GB or sys available memory 1.54 GB less than low water mark 1.60 GB
When I execute a fairly simple SQL query in Apache Doris, I get the error in the title. What should I do to avoid this error?
My Apache Doris version is 2.1.5 and I have 3 FEs with 3 BEs; each BE node has ...
0
votes
1
answer
79
views
Query error: "failed to send brpc batch, error=RPC call is timed out, error_text=[E1008]Reached timeout=300000ms"
I executed a query in Doris and encountered an error:
failed to send brpc batch, error=RPC call is timed out,
error_text=[E1008]Reached timeout=300000ms
The error seems to be related to the RPC ...