Databases

Authors and titles for recent submissions

  • Mon, 6 Oct 2025
  • Fri, 3 Oct 2025
  • Thu, 2 Oct 2025
  • Wed, 1 Oct 2025
  • Tue, 30 Sep 2025

Total of 27 entries

Mon, 6 Oct 2025 (showing 2 of 2 entries)

[1] arXiv:2510.02865 [pdf, other]
Title: A New Normalization Form for Limited Distinct Attributes
Niko S. Snell, Rayen C. Lee
Comments: 11 pages
Subjects: Databases (cs.DB)
[2] arXiv:2510.03203 (cross-list from cs.IR) [pdf, other]
Title: OpenZL: A Graph-Based Model for Compression
Yann Collet, Nick Terrell, W. Felix Handte, Danielle Rozenblit, Victor Zhang, Kevin Zhang, Yaelle Goldschlag, Jennifer Lee, Daniel Riegel, Stan Angelov, Nadav Rotem
Subjects: Information Retrieval (cs.IR); Databases (cs.DB)

Fri, 3 Oct 2025 (showing 1 of 1 entries)

[3] arXiv:2510.02116 (cross-list from cs.LG) [pdf, html, other]
Title: Ensemble Threshold Calibration for Stable Sensitivity Control
John N. Daras
Comments: 10 pages, 6 tables
Subjects: Machine Learning (cs.LG); Databases (cs.DB); Machine Learning (stat.ML)

Thu, 2 Oct 2025 (showing 6 of 6 entries)

[4] arXiv:2510.00549 [pdf, html, other]
Title: EMR-AGENT: Automating Cohort and Feature Extraction from EMR Databases
Kwanhyung Lee, Sungsoo Hong, Joonhyung Park, Jeonghyeop Lim, Juhwan Choi, Donghwee Yoon, Eunho Yang
Comments: currently under submission to ICLR 2026
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
[5] arXiv:2510.00089 [pdf, other]
Title: Data Quality Taxonomy for Data Monetization
Eduardo Vyhmeister, Bastien Pietropoli, Andrea Visentin
Subjects: Databases (cs.DB); Computers and Society (cs.CY)
[6] arXiv:2510.00039 [pdf, html, other]
Title: AutoPK: Leveraging LLMs and a Hybrid Similarity Metric for Advanced Retrieval of Pharmacokinetic Data from Complex Tables and Documents
Hossein Sholehrasa, Amirhossein Ghanaatian, Doina Caragea, Lisa A. Tell, Jim E. Riviere, Majid Jaberi-Douraki
Comments: Accepted at the 2025 IEEE 37th ICTAI
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
[7] arXiv:2510.00566 (cross-list from cs.LG) [pdf, html, other]
Title: Panorama: Fast-Track Nearest Neighbors
Vansh Ramani, Alexis Schlomer, Akash Nayar, Panagiotis Karras, Sayan Ranu, Jignesh M. Patel
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Databases (cs.DB)
[8] arXiv:2510.00394 (cross-list from cs.LG) [pdf, html, other]
Title: Graph2Region: Efficient Graph Similarity Learning with Structure and Scale Restoration
Zhouyang Liu, Yixin Chen, Ning Liu, Jiezhong He, Dongsheng Li
Comments: Accepted by IEEE Transactions on Knowledge and Data Engineering
Subjects: Machine Learning (cs.LG); Databases (cs.DB)
[9] arXiv:2510.00084 (cross-list from cs.AI) [pdf, html, other]
Title: Towards a Framework for Supporting the Ethical and Regulatory Certification of AI Systems
Fabian Kovac, Sebastian Neumaier, Timea Pahi, Torsten Priebe, Rafael Rodrigues, Dimitrios Christodoulou, Maxime Cordy, Sylvain Kubler, Ali Kordia, Georgios Pitsiladis, John Soldatos, Petros Zervoudakis
Comments: Accepted for publication in the proceedings of the Workshop on AI Certification, Fairness and Regulations, co-located with the Austrian Symposium on AI and Vision (AIRoV 2025)
Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Databases (cs.DB)

Wed, 1 Oct 2025 (showing 8 of 8 entries)

[10] arXiv:2509.26434 [pdf, other]
Title: The Grammar of FAIR: A Granular Architecture of Semantic Units for FAIR Semantics, Inspired by Biology and Linguistics
Lars Vogt, Barend Mons
Subjects: Databases (cs.DB)
[11] arXiv:2509.26102 [pdf, other]
Title: Experiversum: an Ecosystem for Curating and Enhancing Data-Driven Experimental Science
Genoveva Vargas-Solar (LIRIS), Umberto Costa, Jérôme Darmont (ERIC, UL2), Javier Espinosa-Oviedo (ERIC, UCBL), Carmem Hara, Sabine Loudcher (ERIC, UL2), Regina Motz, Martin A. Musicante, José-Luis Zechinelli-Martini
Journal-ref: 29th European Conference on Advances in Databases and Information Systems, Sep 2025, Tampere, Finland. pp.98-107
Subjects: Databases (cs.DB)
[12] arXiv:2509.25907 [pdf, html, other]
Title: PAT: Pattern-Perceptive Transformer for Error Detection in Relational Databases
Jian Fu, Xixian Han, Xiaolong Wan, Wenjian Wang
Subjects: Databases (cs.DB)
[13] arXiv:2509.25285 [pdf, html, other]
Title: ActorDB: A Unified Database Model Integrating Single-Writer Actors, Incremental View Maintenance, and Zero-Trust Messaging
Jun Kawasaki
Comments: 7 pages, 1 table, 1 figure. Code and data available at this https URL
Subjects: Databases (cs.DB); Computation and Language (cs.CL); Distributed, Parallel, and Cluster Computing (cs.DC)
[14] arXiv:2509.25264 [pdf, other]
Title: GeoSQL-Eval: First Evaluation of LLMs on PostGIS-Based NL2GeoSQL Queries
Shuyang Hou, Haoyue Jiao, Ziqi Liu, Lutong Xie, Guanyu Chen, Shaowen Wu, Xuefeng Guan, Huayi Wu
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Software Engineering (cs.SE)
[15] arXiv:2509.25839 (cross-list from cs.IR) [pdf, html, other]
Title: RAE: A Neural Network Dimensionality Reduction Method for Nearest Neighbors Preservation in Vector Search
Han Zhang, Dongfang Zhao
Comments: submitted to ICLR 2026
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Databases (cs.DB)
[16] arXiv:2509.25672 (cross-list from cs.AI) [pdf, html, other]
Title: SING-SQL: A Synthetic Data Generation Framework for In-Domain Text-to-SQL Translation
Hasan Alp Caferoğlu, Mehmet Serhat Çelik, Özgür Ulusoy
Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB)
[17] arXiv:2509.25487 (cross-list from cs.LG) [pdf, html, other]
Title: Scalable Disk-Based Approximate Nearest Neighbor Search with Page-Aligned Graph
Dingyi Kang, Dongming Jiang, Hanshen Yang, Hang Liu, Bingzhe Li
Subjects: Machine Learning (cs.LG); Databases (cs.DB); Information Retrieval (cs.IR)

Tue, 30 Sep 2025 (showing 10 of 10 entries)

[18] arXiv:2509.23775 [pdf, html, other]
Title: NeuSO: Neural Optimizer for Subgraph Queries
Linglin Yang, Lei Zou, Chunshan Zhao
Comments: Full version of "NeuSO: Neural Optimizer for Subgraph Queries", accepted to SIGMOD 2026
Subjects: Databases (cs.DB)
[19] arXiv:2509.23577 [pdf, html, other]
Title: ML-Asset Management: Curation, Discovery, and Utilization
Mengying Wang, Moming Duan, Yicong Huang, Chen Li, Bingsheng He, Yinghui Wu
Comments: Tutorial, VLDB 2025. Project page: this https URL
Journal-ref: PVLDB, 18(12): 5493 - 5498, 2025
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
[20] arXiv:2509.23338 [pdf, other]
Title: PARROT: A Benchmark for Evaluating LLMs in Cross-System SQL Translation
Wei Zhou, Guoliang Li, Haoyu Wang, Yuxing Han, Xufei Wu, Fan Wu, Xuanhe Zhou
Comments: To appear in NeurIPS 2025. Welcome your submission to challenge our leaderboard at: this https URL. Also visit our code repository at: this https URL
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG)
[21] arXiv:2509.24405 (cross-list from cs.CL) [pdf, html, other]
Title: Multilingual Text-to-SQL: Benchmarking the Limits of Language Models with Collaborative Language Agents
Khanh Trinh Pham, Thu Huong Nguyen, Jun Jo, Quoc Viet Hung Nguyen, Thanh Tam Nguyen
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Databases (cs.DB); Emerging Technologies (cs.ET); Information Retrieval (cs.IR)
[22] arXiv:2509.24403 (cross-list from cs.CL) [pdf, html, other]
Title: Agentar-Scale-SQL: Advancing Text-to-SQL through Orchestrated Test-Time Scaling
Pengfei Wang, Baolin Sun, Xuemei Dong, Yaxun Dai, Hongwei Yuan, Mengdie Chu, Yingqi Gao, Xiang Qi, Peng Zhang, Ying Yan
Subjects: Computation and Language (cs.CL); Databases (cs.DB)
[23] arXiv:2509.24127 (cross-list from cs.AI) [pdf, other]
Title: Transparent, Evaluable, and Accessible Data Agents: A Proof-of-Concept Framework
Nooshin Bahador
Comments: 20 pages, 11 figures
Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB)
[24] arXiv:2509.23988 (cross-list from cs.AI) [pdf, html, other]
Title: LLM/Agent-as-Data-Analyst: A Survey
Zirui Tang, Weizheng Wang, Zihang Zhou, Yang Jiao, Bangrui Xu, Boyu Niu, Xuanhe Zhou, Guoliang Li, Yeye He, Wei Zhou, Yitong Song, Cheng Tan, Bin Wang, Conghui He, Xiaoyang Wang, Fan Wu
Comments: 35 pages, 11 figures
Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB)
[25] arXiv:2509.23942 (cross-list from cs.LG) [pdf, html, other]
Title: Efficient Identification of High Similarity Clusters in Polygon Datasets
John N. Daras
Comments: 11 pages, 3 figures
Subjects: Machine Learning (cs.LG); Databases (cs.DB); Quantitative Methods (q-bio.QM)
[26] arXiv:2509.23834 (cross-list from cs.CR) [pdf, html, other]
Title: GPM: The Gaussian Pancake Mechanism for Planting Undetectable Backdoors in Differential Privacy
Haochen Sun, Xi He
Comments: 16 pages, 7 figures. Not published yet. Code and raw experimental logs will be available after publication, or upon email request
Subjects: Cryptography and Security (cs.CR); Databases (cs.DB)
[27] arXiv:2509.23645 (cross-list from cs.SE) [pdf, html, other]
Title: Similarity-Based Assessment of Computational Reproducibility in Jupyter Notebooks
A S M Shahadat Hossain, Colin Brown, David Koop, Tanu Malik
Comments: 10 pages
Journal-ref: ACM Conference on Reproducibility and Replicability, 2025
Subjects: Software Engineering (cs.SE); Databases (cs.DB)