Skip to main content
Cornell University
We gratefully acknowledge support from the Simons Foundation, member institutions, and all contributors. Donate
arxiv logo > cs.SI
arXiv logo
Cornell University Logo

quick links

  • Login
  • Help Pages
  • About

Social and Information Networks

  • New submissions
  • Cross-lists
  • Replacements

See recent articles

Showing new listings for Monday, 6 October 2025

Total of 8 entries
Showing up to 2000 entries per page: fewer | more | all

New submissions (showing 1 of 1 entries)

[1] arXiv:2510.02568 [pdf, html, other]
Title: Identifying Asymptomatic Nodes in Network Epidemics using Graph Neural Networks
Conrado Catarcione Pinto, Amanda Camacho Novaes de Oliveira, Rodrigo Sapienza Luna, Daniel Ratton Figueiredo
Comments: Paper presented in the 35th Brazilian Conference on Intelligent Systems (BRACIS)
Subjects: Social and Information Networks (cs.SI); Neural and Evolutionary Computing (cs.NE); Populations and Evolution (q-bio.PE)

Infected individuals in some epidemics can remain asymptomatic while still carrying and transmitting the infection. These individuals contribute to the spread of the epidemic and pose a significant challenge to public health policies. Identifying asymptomatic individuals is critical for measuring and controlling an epidemic, but periodic and widespread testing of healthy individuals is often too costly. This work tackles the problem of identifying asymptomatic individuals considering a classic SI (Susceptible-Infected) network epidemic model where a fraction of the infected nodes are not observed as infected (i.e., their observed state is identical to susceptible nodes). In order to classify healthy nodes as asymptomatic or susceptible, a Graph Neural Network (GNN) model with supervised learning is adopted where a set of node features are built from the network with observed infected nodes. The approach is evaluated across different network models, network sizes, and fraction of observed infections. Results indicate that the proposed methodology is robust across different scenarios, accurately identifying asymptomatic nodes while also generalizing to different network sizes and fraction of observed infections.

Cross submissions (showing 6 of 6 entries)

[2] arXiv:2510.02333 (cross-list from cs.CL) [pdf, html, other]
Title: Human Mobility Datasets Enriched With Contextual and Social Dimensions
Chiara Pugliese, Francesco Lettich, Guido Rocchietti, Chiara Renso, Fabio Pinelli
Comments: 5 pages, 3 figures, 1 table
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Social and Information Networks (cs.SI)

In this resource paper, we present two publicly available datasets of semantically enriched human trajectories, together with the pipeline to build them. The trajectories are publicly available GPS traces retrieved from OpenStreetMap. Each dataset includes contextual layers such as stops, moves, points of interest (POIs), inferred transportation modes, and weather data. A novel semantic feature is the inclusion of synthetic, realistic social media posts generated by Large Language Models (LLMs), enabling multimodal and semantic mobility analysis. The datasets are available in both tabular and Resource Description Framework (RDF) formats, supporting semantic reasoning and FAIR data practices. They cover two structurally distinct, large cities: Paris and New York. Our open source reproducible pipeline allows for dataset customization, while the datasets support research tasks such as behavior modeling, mobility prediction, knowledge graph construction, and LLM-based applications. To our knowledge, our resource is the first to combine real-world movement, structured semantic enrichment, LLM-generated text, and semantic web compatibility in a reusable framework.

[3] arXiv:2510.02637 (cross-list from physics.soc-ph) [pdf, html, other]
Title: Homophily-induced Emergence of Biased Structures in LLM-based Multi-Agent AI Systems
Aliakbar Mehdizadeh, Martin Hilbert
Comments: Accepted for publication in Social Network Analysis and Mining
Subjects: Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI)

This study examines how interactions among artificially intelligent (AI) agents, guided by large language models (LLMs), drive the evolution of collective network structures. We ask LLM-driven agents to grow a network by informing them about current link constellations. Our observations confirm that agents consistently apply a preferential attachment mechanism, favoring connections to nodes with higher degrees. We systematically solicited more than a million decisions from four different LLMs, including Gemini, ChatGPT, Llama, and Claude. When social attributes such as age, gender, religion, and political orientation are incorporated, the resulting networks exhibit heightened assortativity, leading to the formation of distinct homophilic communities. This significantly alters the network topology from what would be expected under a pure preferential attachment model alone. Political and religious attributes most significantly fragment the collective, fostering polarized subgroups, while age and gender yield more gradual structural shifts. Strikingly, LLMs also reveal asymmetric patterns in heterophilous ties, suggesting embedded directional biases reflective of societal norms. As autonomous AI agents increasingly shape the architecture of online systems, these findings contribute to how algorithmic choices of generative AI collectives not only reshape network topology, but offer critical insights into how AI-driven systems co-evolve and self-organize.

[4] arXiv:2510.02702 (cross-list from cs.CE) [pdf, html, other]
Title: VisitHGNN: Heterogeneous Graph Neural Networks for Modeling Point-of-Interest Visit Patterns
Lin Pang, Jidong J. Yang
Comments: 16 pages, 9 figures, 5 tables
Subjects: Computational Engineering, Finance, and Science (cs.CE); Social and Information Networks (cs.SI); Machine Learning (stat.ML)

Understanding how urban residents travel between neighborhoods and destinations is critical for transportation planning, mobility management, and public health. By mining historical origin-to-destination flow patterns with spatial, temporal, and functional relations among urban places, we estimate probabilities of visits from neighborhoods to specific destinations. These probabilities capture neighborhood-level contributions to citywide vehicular and foot traffic, supporting demand estimation, accessibility assessment, and multimodal planning. Particularly, we introduce VisitHGNN, a heterogeneous, relation-specific graph neural network designed to predict visit probabilities at individual Points of interest (POIs). POIs are characterized using numerical, JSON-derived, and textual attributes, augmented with fixed summaries of POI--POI spatial proximity, temporal co-activity, and brand affinity, while census block groups (CBGs) are described with 72 socio-demographic variables. CBGs are connected via spatial adjacency, and POIs and CBGs are linked through distance-annotated cross-type edges. Inference is constrained to a distance-based candidate set of plausible origin CBGs, and training minimizes a masked Kullback-Leibler (KL) divergence to yield probability distribution across the candidate set. Using weekly mobility data from Fulton County, Georgia, USA, VisitHGNN achieves strong predictive performance with mean KL divergence of 0.287, MAE of 0.008, Top-1 accuracy of 0.853, and R-square of 0.892, substantially outperforming pairwise MLP and distance-only baselines, and aligning closely with empirical visitation patterns (NDCG@50 = 0.966); Recall@5 = 0.611). The resulting distributions closely mirror observed travel behavior with high fidelity, highlighting the model's potential for decision support in urban planning, transportation policy, mobility system design, and public health.

[5] arXiv:2510.02811 (cross-list from cs.CL) [pdf, html, other]
Title: A Computational Framework for Interpretable Text-Based Personality Assessment from Social Media
Matej Gjurković
Comments: Phd thesis
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Social and Information Networks (cs.SI)

Personality refers to individual differences in behavior, thinking, and feeling. With the growing availability of digital footprints, especially from social media, automated methods for personality assessment have become increasingly important. Natural language processing (NLP) enables the analysis of unstructured text data to identify personality indicators. However, two main challenges remain central to this thesis: the scarcity of large, personality-labeled datasets and the disconnect between personality psychology and NLP, which restricts model validity and interpretability. To address these challenges, this thesis presents two datasets -- MBTI9k and PANDORA -- collected from Reddit, a platform known for user anonymity and diverse discussions. The PANDORA dataset contains 17 million comments from over 10,000 users and integrates the MBTI and Big Five personality models with demographic information, overcoming limitations in data size, quality, and label coverage. Experiments on these datasets show that demographic variables influence model validity. In response, the SIMPA (Statement-to-Item Matching Personality Assessment) framework was developed - a computational framework for interpretable personality assessment that matches user-generated statements with validated questionnaire items. By using machine learning and semantic similarity, SIMPA delivers personality assessments comparable to human evaluations while maintaining high interpretability and efficiency. Although focused on personality assessment, SIMPA's versatility extends beyond this domain. Its model-agnostic design, layered cue detection, and scalability make it suitable for various research and practical applications involving complex label taxonomies and variable cue associations with target concepts.

[6] arXiv:2510.02889 (cross-list from eess.SY) [pdf, html, other]
Title: Delay-Tolerant Augmented-Consensus-based Distributed Directed Optimization
Mohammadreza Doostmohammadian, Narahari Kasagatta Ramesh, Alireza Aghasi
Comments: Systems & Control Letters
Subjects: Systems and Control (eess.SY); Multiagent Systems (cs.MA); Social and Information Networks (cs.SI); Signal Processing (eess.SP); Optimization and Control (math.OC)

Distributed optimization finds applications in large-scale machine learning, data processing and classification over multi-agent networks. In real-world scenarios, the communication network of agents may encounter latency that may affect the convergence of the optimization protocol. This paper addresses the case where the information exchange among the agents (computing nodes) over data-transmission channels (links) might be subject to communication time-delays, which is not well addressed in the existing literature. Our proposed algorithm improves the state-of-the-art by handling heterogeneous and arbitrary but bounded and fixed (time-invariant) delays over general strongly-connected directed networks. Arguments from matrix theory, algebraic graph theory, and augmented consensus formulation are applied to prove the convergence to the optimal value. Simulations are provided to verify the results and compare the performance with some existing delay-free algorithms.

[7] arXiv:2510.03152 (cross-list from cs.CV) [pdf, other]
Title: ReeMark: Reeb Graphs for Simulating Patterns of Life in Spatiotemporal Trajectories
Anantajit Subrahmanya, Chandrakanth Gudavalli, Connor Levenson, Umang Garg, B.S. Manjunath
Comments: 15 pages, 3 figures, 2 algorithms, 1 table
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG); Social and Information Networks (cs.SI)

Accurately modeling human mobility is critical for urban planning, epidemiology, and traffic management. In this work, we introduce Markovian Reeb Graphs, a novel framework for simulating spatiotemporal trajectories that preserve Patterns of Life (PoLs) learned from baseline data. By combining individual- and population-level mobility structures within a probabilistic topological model, our approach generates realistic future trajectories that capture both consistency and variability in daily life. Evaluations on the Urban Anomalies dataset (Atlanta and Berlin subsets) using the Jensen-Shannon Divergence (JSD) across population- and agent-level metrics demonstrate that the proposed method achieves strong fidelity while remaining data- and compute-efficient. These results position Markovian Reeb Graphs as a scalable framework for trajectory simulation with broad applicability across diverse urban environments.

Replacement submissions (showing 1 of 1 entries)

[8] arXiv:2407.12146 (replaced) [pdf, html, other]
Title: Physical partisan proximity outweighs online ties in predicting US voting outcomes
Marco Tonin, Bruno Lepri, Michele Tizzoni
Comments: 20 pages, 4 figures, SI Appendix: 29 pages
Subjects: Social and Information Networks (cs.SI); Computers and Society (cs.CY); Physics and Society (physics.soc-ph)

Affective polarization and increasing social divisions affect social mixing and the spread of information across online and physical spaces, reinforcing social and electoral cleavages and influencing political outcomes. Here, using individual survey data and aggregated and de-identified co-location and online network data, we investigate the relationship between partisan exposure and vote choice in the US by comparing offline and online dimensions of partisan exposure. By leveraging various statistical modeling approaches, we consistently find that partisan exposure in the physical space, as captured by co-location patterns, more accurately predicts electoral outcomes in US counties, outperforming online and residential exposures. Similarly, offline ties at the individual level better predict vote choice compared to online connections. We also estimate county-level experienced partisan segregation and examine its relationship with individuals' demographic and socioeconomic characteristics. Focusing on metropolitan areas, our results confirm the presence of extensive partisan segregation in the US and show that offline partisan isolation, both considering physical encounters or residential sorting, is higher than online segregation and is primarily associated with educational attainment. Our findings emphasize the importance of physical space in understanding the relationship between social networks and political behavior, in contrast to the intense scrutiny focused on online social networks and elections.

Total of 8 entries
Showing up to 2000 entries per page: fewer | more | all
  • About
  • Help
  • contact arXivClick here to contact arXiv Contact
  • subscribe to arXiv mailingsClick here to subscribe Subscribe
  • Copyright
  • Privacy Policy
  • Web Accessibility Assistance
  • arXiv Operational Status
    Get status notifications via email or slack