Graph Neural Networks (GNNs) have been widely used across various fields under the homophily assumption that connected nodes are similar. However, in heterophilic graphs, where connected nodes tend to have dissimilar features, existing GNNs still face notable limitations. From the structural perspective, shallow GNNs cannot capture high-order node information, whereas deep GNNs may suffer from the over-smoothing problem. From the feature perspective, the useful information of high-order similar nodes is often weakened by low-order dissimilar nodes during the feature update phase. To address these problems, we propose a Global Structure-aware and Feature-augmented Graph Neural Network (GSF-GNN) that alleviates both limitations. Specifically, from the structure perspective, we design a Structure-based Global Propagation (SGP) module to establish global connections among nodes and adaptively adjust edge weights for message propagation. From the feature perspective, we introduce a Feature-augmented Compensatory Update (FCU) module, which employs a multi-view feature updating mechanism to enhance node features from different perspectives. Our theoretical analysis formally demonstrates the effectiveness of GSF-GNN on heterophilic graphs. Experiments on heterophilic and homophilic benchmark datasets validate the effectiveness of GSF-GNN across various graph structures. Moreover, GSF-GNN achieves stable performance across multiple layers and effectively alleviates the over-smoothing problem. Our code is available at https://github.com/huijieliu2023/GSF-GNN.
Time series forecasting models are becoming increasingly prevalent due to their critical role in decision-making across various domains. However, most existing approaches model the coupled temporal patterns as a whole, neglecting the distinction between their specific components. In particular, fluctuating seasonal patterns and smooth trends within time series exhibit distinct characteristics. In this work, to model complicated temporal patterns, we propose a Conditional Denoising Polynomial Modeling (CDPM) framework, in which probabilistic diffusion models and deterministic linear models are trained end-to-end. Instead of modeling the coupled time series directly, CDPM decomposes it into trend and seasonal components and models them separately. To capture the fluctuating seasonal component, we employ a probabilistic diffusion model based on statistical properties from the historical window. For the smooth trend component, we propose a module that enhances linear models by incorporating historical dependencies, thereby preserving underlying trends and mitigating noise distortion. Extensive experiments conducted on six benchmarks demonstrate the effectiveness of our framework, highlighting the potential of combining probabilistic and deterministic models. Our code is available at https://github.com/zjt-gpu/CDPM.
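The decomposition step is described above only at a high level. As a minimal sketch of the general idea, a centered moving average can split a series into a smooth trend and a fluctuating seasonal residual; the `decompose` name and `window` size here are illustrative assumptions, not CDPM's actual implementation:

```python
import numpy as np

def decompose(series: np.ndarray, window: int = 5):
    """Split a 1-D series into a smooth trend and a fluctuating
    seasonal (residual) component via a centered moving average."""
    pad = window // 2
    # Edge-pad so the moving average is defined at the boundaries.
    padded = np.pad(series, pad, mode="edge")
    kernel = np.ones(window) / window
    trend = np.convolve(padded, kernel, mode="valid")
    seasonal = series - trend
    return trend, seasonal

t = np.arange(48, dtype=float)
series = 0.5 * t + np.sin(t)          # linear trend + periodic part
trend, seasonal = decompose(series)
assert np.allclose(trend + seasonal, series)  # components sum back exactly
```

Because the seasonal part is defined as the residual, the two components always reconstruct the input exactly, which is what lets each component be modeled by a separate module.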
Intraoperative hypotension (IOH) prediction using past physiological signals is crucial, as IOH may lead to inadequate organ perfusion and significantly elevate the risk of severe complications and mortality. However, current methods often rely on static modeling, overlooking the complex temporal dependencies and the inherently non-stationary nature of physiological signals. We propose a Hybrid Multi-Factor (HMF) network that formulates IOH prediction as a dynamic sequence forecasting task, explicitly capturing both temporal dependencies and physiological non-stationarity. We represent signal dynamics as multivariate time series and decompose them into trend and seasonal components, enabling separate modeling of long-term and periodic variations. Each component is encoded with a patch-based Transformer to balance computational efficiency and feature representation. To address distributional drift from evolving signals, we introduce a symmetric normalization mechanism. Experiments on both public and real-world clinical datasets show that HMF significantly outperforms competitive baselines. We hope HMF offers new insights into IOH prediction and ultimately promotes safer surgical care. Our code is available at https://github.com/Mingyue-Cheng/HMF.
Large language models (LLMs) have shown promise in multivariate time series classification (MTSC). To effectively adapt LLMs for MTSC, it is crucial to generate comprehensive and informative data representations. Most methods utilizing LLMs encode numerical time series into the model's latent space, aiming to align with the semantic space of LLMs for more effective learning. Despite their effectiveness, we highlight three limitations that these methods overlook: (1) they struggle to incorporate temporal and channel-specific information, both of which are essential components of multivariate time series; (2) aligning the learned representation space with the semantic space of LLMs proves to be a significant challenge; (3) they often require task-specific retraining, preventing training-free inference despite the generalization capabilities of LLMs. To bridge these gaps, we propose TableTime, which reformulates MTSC as a table understanding task. Specifically, TableTime introduces the following strategies: (1) utilizing tabular form to unify the format of time series, facilitating the transition from a model-centric approach to a data-centric approach; (2) representing time series in text format to facilitate seamless alignment with the semantic space of LLMs; (3) designing a knowledge-task dual-driven reasoning framework that integrates contextual information and expert-level reasoning guidance to enhance LLMs' reasoning capabilities and enable training-free classification. Extensive experiments conducted on 10 publicly available benchmark datasets from the UEA archive validate the substantial potential of TableTime as a new paradigm for MTSC. The code is publicly available at https://github.com/realwangjiahao/TableTime.
Retrieval-augmented generation (RAG) is increasingly recognized as an effective approach to mitigating the hallucination of large language models (LLMs) through the integration of external knowledge. Despite numerous efforts, most studies focus on a single type of external knowledge source. In real-world applications, however, most situations involve diverse knowledge from various sources, yet this setting remains underexplored. The main dilemma is the lack of a suitable dataset containing multiple knowledge sources and of pre-exploration of the associated issues. To address these challenges, we standardize a benchmark dataset that combines structured and unstructured knowledge across diverse and complementary domains. Based on this dataset, we further develop a plug-and-play RAG framework, PruningRAG, whose main characteristic is the use of multi-granularity pruning strategies to optimize the integration of relevant information while minimizing misleading context. It consistently improves performance across various existing RAG variants, demonstrating its robustness and broad applicability. Building upon the standardized dataset and PruningRAG, we also report a series of experimental results and insightful findings. Our dataset and code are publicly available at https://github.com/USTCAGI/PruningRAG, with the aim of advancing future research in the RAG community.
Logical reasoning with large language models (LLMs) has received growing attention. One mainstream approach translates natural language into formal logic and then applies symbolic solvers for deduction. While effective in many tasks, these LLM-based translators often fail to generate consistent symbolic representations when the same concept appears in different linguistic forms. Such inconsistencies break logical coherence and lead to solver errors. However, most existing benchmarks lack this type of linguistic variation, which frequently occurs in real-world text, leaving the problem underexplored. To address this gap, we present SoLT, a benchmark that systematically rewrites reasoning datasets into diverse yet logically equivalent forms across multiple levels. Beyond evaluation, SoLT also provides a general method to enrich any dataset with linguistic diversity while preserving both meaning and logic. To further enhance the stability of LLM-based reasoning, we propose MenTaL, which explicitly guides models to build a concept-symbol mapping table during translation. By linking equivalent expressions to shared symbols, MenTaL maintains consistency and mitigates symbol drift. Experiments on SoLT demonstrate that LLMs indeed suffer from inconsistent symbol mapping under linguistic variation, leading to significant drops in reasoning accuracy. Meanwhile, applying MenTaL brings clear and stable performance improvements across diverse inputs. Overall, our findings reveal that overlooking linguistic diversity hides key weaknesses in LLM-based translators, and our work offers a step toward more reliable logical reasoning in varied real-world scenarios. Our code is available at https://github.com/qingchuanli/LinguDiver.
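MenTaL's concept-symbol mapping table is built by the LLM itself during translation; the bookkeeping it performs can be illustrated with a toy stub in which equivalent surface forms of a concept are registered under one shared symbol (the `SymbolTable` class and its method names are hypothetical, not the paper's code):

```python
class SymbolTable:
    """Toy concept-symbol mapping: equivalent surface forms of a
    concept are linked to one shared predicate symbol, so repeated
    mentions translate to the same symbol instead of drifting."""

    def __init__(self):
        self._aliases = {}   # normalized surface form -> canonical symbol
        self._count = 0

    def register(self, *forms):
        """Assign a fresh symbol and link all given paraphrases to it."""
        symbol = f"P{self._count}"
        self._count += 1
        for form in forms:
            self._aliases[form.lower()] = symbol
        return symbol

    def lookup(self, form):
        """Return the shared symbol for a surface form, if known."""
        return self._aliases.get(form.lower())

table = SymbolTable()
table.register("is a doctor", "works as a physician")
# Both paraphrases now map to the same symbol, avoiding symbol drift.
assert table.lookup("Is a Doctor") == table.lookup("works as a physician")
```

In the actual method, deciding that two expressions are paraphrases is done by the LLM during translation; the table only enforces that, once linked, they resolve to one symbol.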
Designing effective models for learning time series representations is foundational for time series analysis. Many previous works explore time series representation modeling approaches and have made progress in this area. Despite their effectiveness, they lack adaptive perception of local patterns in temporally dependent basic units and fail to capture the multi-scale dependency among these units. Instead of relying on prevalent methods centered around self-attention mechanisms, we propose ConvTimeNet, a hierarchical pure convolutional model designed for time series analysis. ConvTimeNet introduces a deformable patch layer that adaptively perceives local patterns of temporally dependent basic units in a data-driven manner. Based on the extracted local patterns, hierarchical pure convolutional blocks are designed to capture dependency relationships among the representations of basic units at different scales. Moreover, a large kernel mechanism is employed to ensure that convolutional blocks can be deeply stacked, thereby achieving a larger receptive field. In this way, local patterns and their multi-scale dependencies can be effectively modeled within a single model. Extensive experiments against a wide range of model types demonstrate that pure convolutional models remain strongly viable, effectively addressing the two aforementioned challenges and showing superior performance across multiple tasks. The code is available for reproducibility at https://github.com/Mingyue-Cheng/ConvTimeNet.
Time series forecasting (TSF) has become an increasingly vital tool in various decision-making applications, including business intelligence and scientific discovery, in today's rapidly evolving digital landscape. Over the years, a wide range of methods for TSF has been proposed, spanning from traditional statistical models to more recent machine learning-driven, data-intensive approaches. Despite the extensive body of research, there is still no universally accepted, unified problem statement or systematic elaboration of the core challenges and characteristics of TSF. The extent to which deep TSF models can address fundamental issues, such as data sparsity and non-stationarity, remains unclear, and the broader TSF research landscape continues to evolve, shaped by diverse methodological trends. This comprehensive survey aims to address these gaps by examining the key entities in TSF (e.g., covariates) and their characteristics (e.g., frequency, length, missing values). We introduce a general problem formulation and challenge analysis for TSF, propose a taxonomy that classifies representative methodologies from the preprocessing and forecasting perspectives, and highlight emerging topics such as transfer learning and trustworthy forecasting. Finally, we discuss promising research directions that are poised to drive innovation in this dynamic and rapidly advancing field. The related paper list is available at https://github.com/USTCAGI/Awesome-Papers-Time-Series-Forecasting.
Table mining is a popular research field that involves complicated technologies, including information retrieval, data mining, visual and textual understanding, and logical reasoning. With the emergence of Large Language Models (LLMs), the field has witnessed considerable advancements, presenting new paradigms for table understanding, extraction, and reasoning. In this survey, we conduct a comprehensive review of the literature on table mining with LLMs. We begin with a fundamental overview of tabular data and the possible challenges in LLM-based table mining. Specifically, we explore the challenges unique to this domain, such as heterogeneous table structures, contextual ambiguity, and domain-specific knowledge requirements. Then, we summarize representative tabular tasks in table preparation and mining, categorizing existing methods along dimensions including task scope, model architecture, and application scenarios. Next, we describe advanced LLM-based learning strategies in table mining, including foundation models and training-free methods. We also review studies of trustworthy LLM-based table mining and several domain-specific applications. Finally, we discuss prospects and future directions in the field of LLM-based table reasoning, including issues of generalization, interpretability, and efficiency. We hope this survey provides valuable resources for researchers and practitioners, paving the way for further exploration in this field. The repository is at https://github.com/USTCAGI/Awesome-LLM-Table-Mining.
For the advancement of time series classification, we observe that most existing methods adopt a common learning-to-classify paradigm: a classifier model learns the relation between sequence inputs and target labels encoded as one-hot distributions. Although effective, this paradigm carries two inherent limitations: (1) a one-hot distribution fails to reflect the comparability and similarity between labels, and (2) it is difficult to learn transferable representations across domains. In this work, we propose InstructTime, a novel attempt to reshape time series classification into a learning-to-generate paradigm. Relying on the generative capacity of a pre-trained language model, the core idea is to formulate the classification of time series as a multimodal understanding task. Specifically, a time series discretization module first converts continuous inputs into a sequence of discrete tokens to resolve the inconsistency across modalities. Second, we introduce an alignment projection layer before feeding the transformed time series tokens into the language model. Third, prior to fine-tuning the language model for the target domain, we emphasize the necessity of auto-regressive pre-training across various modality inputs. Finally, extensive experiments are conducted on several prevalent public benchmark datasets, indicating the superior performance of InstructTime. Our code is available at https://github.com/Mingyue-Cheng/InstructTime.
Pre-training universal models across multiple domains to enhance downstream tasks is a prevalent learning paradigm. However, there has been minimal progress in pre-training transferable models across domains for time series representation. This dilemma is caused by two key factors: the limited availability of training data within each domain and the substantial differences in data characteristics between domains. To address these challenges, we present a novel framework, CrossTimeNet, designed to perform cross-domain self-supervised pre-training to benefit target tasks. Specifically, to address the issue of data scarcity, we utilize a pre-trained language model as the backbone network to effectively capture the sequence dependencies of the input time series. Meanwhile, we adopt the recovery of corrupted region inputs as a self-supervised optimization objective, taking into account the locality of the time series. To address discrepancies in data characteristics, we introduce a novel tokenization module that converts continuous time series inputs into discrete token sequences using vector quantization techniques. This approach facilitates the learning of transferable time series models across different domains. Extensive experimental results on diverse time series tasks, including classification and forecasting, demonstrate the effectiveness of our approach. Our code is publicly available at https://github.com/Mingyue-Cheng/CrossTimeNet.
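The vector-quantization tokenization described above can be sketched as nearest-codeword assignment over fixed-length segments. This is a minimal illustration only: the segment length, codebook, and the `vq_tokenize` name are assumptions, and CrossTimeNet's actual module learns its codebook rather than fixing it:

```python
import numpy as np

def vq_tokenize(series: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Map each fixed-length segment of a 1-D series to the index of
    its nearest codebook vector (Euclidean distance), producing a
    discrete token sequence shared across domains."""
    seg_len = codebook.shape[1]
    n = len(series) // seg_len
    segments = series[: n * seg_len].reshape(n, seg_len)
    # (n, K) pairwise distances between segments and codewords
    dists = np.linalg.norm(segments[:, None, :] - codebook[None, :, :], axis=-1)
    return dists.argmin(axis=1)

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))                   # 8 codewords of length 4
series = np.concatenate([codebook[3], codebook[5]])  # two known segments
assert vq_tokenize(series, codebook).tolist() == [3, 5]
```

Once every domain's series is expressed in the same discrete vocabulary, a single language-model backbone can consume them uniformly, which is what enables cross-domain transfer.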
Large language model evaluation plays a pivotal role in enhancing model capabilities. Numerous methods for evaluating large language models have previously been proposed in this area. Despite their effectiveness, these existing works mainly focus on assessing objective questions, overlooking subjective questions, which are extremely common for large language models. Additionally, these methods predominantly utilize centralized datasets for evaluation, with question banks concentrated within the evaluation platforms themselves. Moreover, the evaluation processes employed by these platforms often overlook personalized factors, neglecting the individual characteristics of both the evaluators and the models being evaluated. To address these limitations, we propose BingJian, a novel anonymous crowd-sourcing evaluation platform for large language models that employs a competitive scoring mechanism in which users rank models based on their performance. The platform stands out not only for supporting centralized evaluations that assess the general capabilities of models but also for offering an open evaluation gateway. Through this gateway, users can submit their own questions, testing the models on a personalized and potentially broader range of capabilities. Furthermore, our platform introduces personalized evaluation scenarios, leveraging various forms of human-computer interaction to assess large language models in a manner that accounts for individual user preferences and contexts. A demonstration of BingJian can be accessed at https://github.com/Mingyue-Cheng/Bingjian.
Generating user-friendly explanations of why an item is recommended has become increasingly common, largely due to advances in language generation technology, which can enhance user trust and facilitate more informed decision-making when using online services. However, existing explainable recommendation systems focus on small-size language models. It remains uncertain what impact replacing the explanation generator with the recently emerging large language models (LLMs) would have. Can we expect unprecedented results? In this study, we propose LLMXRec, a simple yet effective two-stage explainable recommendation framework aimed at further boosting explanation quality by employing LLMs. Unlike most existing LLM-based recommendation works, a key characteristic of LLMXRec is its emphasis on close collaboration between previous recommender models and LLM-based explanation generators. Specifically, by adopting several key fine-tuning techniques, including parameter-efficient instruction tuning and personalized prompt techniques, controllable and fluent explanations can be generated to achieve the goal of explainable recommendation. Most notably, we provide three different perspectives for evaluating the effectiveness of the explanations. Finally, we conduct extensive experiments over several benchmark recommender models and publicly available datasets. The experimental results not only yield positive results in terms of effectiveness and efficiency but also uncover some previously unknown outcomes. To facilitate further exploration in this area, the full code and detailed original results are open-sourced at https://github.com/GodFire66666/LLM_rec_explanation/.
Enhancing the expressive capacity of deep learning-based time series models with self-supervised pre-training has become increasingly prevalent in time series classification. Even though numerous efforts have been devoted to developing self-supervised models for time series data, we argue that current methods are not sufficient to learn optimal time series representations due to solely unidirectional encoding over sparse point-wise input units. In this work, we propose TimeMAE, a novel self-supervised paradigm for learning transferable time series representations based on transformer networks. The distinct characteristic of TimeMAE lies in processing each time series into a sequence of non-overlapping sub-series via window-slicing partitioning, followed by random masking strategies over the semantic units of localized sub-series. Such a simple yet effective setting can help us achieve the goal of killing three birds with one stone: (1) learning enriched contextual representations of time series with a bidirectional encoding scheme; (2) increasing the information density of basic semantic units; (3) efficiently encoding representations of time series using transformer networks. Nevertheless, it is non-trivial to perform the reconstruction task over such a newly formulated modeling paradigm. To solve the discrepancy issue incurred by newly injected masked embeddings, we design a decoupled autoencoder architecture, which learns the representations of visible (unmasked) positions and masked ones with two different encoder modules, respectively. Furthermore, we construct two types of informative targets to accomplish the corresponding pretext tasks. One is to create a tokenizer module that assigns a codeword to each masked region, allowing the masked codeword classification (MCC) task to be completed effectively.
The other is to adopt a siamese network structure to generate target representations for each masked input unit, aiming to perform masked representation regression (MRR) optimization. Comprehensively pre-trained, our model can efficiently learn transferable time series representations, thus benefiting the classification of time series. We extensively perform experiments on five benchmark datasets to verify the effectiveness of TimeMAE. Experimental results show that TimeMAE can significantly surpass previous competitive baselines. Furthermore, we also demonstrate the universality of the learned representations by performing transfer learning experiments. For reproducibility, we make our experiment code publicly available to facilitate research on self-supervised time series representations at https://github.com/Mingyue-Cheng/TimeMAE.
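TimeMAE's window-slicing partitioning and random masking can be sketched as follows. The window size and mask ratio here are illustrative defaults (not the paper's hyperparameters), and the decoupled encoders that consume the visible and masked units are not shown:

```python
import numpy as np

def slice_and_mask(series: np.ndarray, window: int, mask_ratio: float, seed: int = 0):
    """Partition a 1-D series into non-overlapping sub-series
    ("semantic units") and randomly mask a fraction of them, as in
    masked time-series pre-training."""
    n = len(series) // window
    units = series[: n * window].reshape(n, window)
    rng = np.random.default_rng(seed)
    n_masked = int(round(n * mask_ratio))
    masked_idx = rng.choice(n, size=n_masked, replace=False)
    mask = np.zeros(n, dtype=bool)
    mask[masked_idx] = True
    # Encoder sees units[~mask]; pretext targets come from units[mask].
    return units, mask

series = np.arange(60, dtype=float)
units, mask = slice_and_mask(series, window=6, mask_ratio=0.6)
assert units.shape == (10, 6)
assert mask.sum() == 6   # 60% of the 10 units are masked
```

Compared with point-wise masking, masking whole sub-series raises the information density of each prediction target, which is one of the three benefits claimed above.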
Recent advances in diffusion models have brought remarkable visual fidelity to instruction-guided image editing. However, their global denoising process inherently entangles the edited region with the entire image context, leading to unintended spurious modifications and compromised adherence to editing instructions. In contrast, autoregressive models offer a distinct paradigm by formulating image synthesis as a sequential process over discrete visual tokens. Their causal and compositional mechanism naturally circumvents the adherence challenges of diffusion-based methods. In this paper, we present VAREdit, a visual autoregressive (VAR) framework that reframes image editing as a next-scale prediction problem. Conditioned on source image features and text instructions, VAREdit generates multi-scale target features to achieve precise edits. A core challenge in this paradigm is how to effectively condition the source image tokens. We observe that finest-scale source features cannot effectively guide the prediction of coarser target features. To bridge this gap, we introduce a Scale-Aligned Reference (SAR) module, which injects scale-matched conditioning information into the first self-attention layer. VAREdit demonstrates significant advancements in both editing adherence and efficiency. On EMU-Edit and PIE-Bench benchmarks, VAREdit outperforms leading diffusion-based methods by a substantial margin in terms of both CLIP and GPT scores. Moreover, VAREdit completes a 512x512 editing in 1.2 seconds, making it 2.2x faster than the similarly sized UltraEdit. Code is available at: https://github.com/HiDream-ai/VAREdit.
Self-supervised learning has garnered increasing attention in time series analysis for benefiting various downstream tasks and reducing reliance on labeled data. Despite its effectiveness, existing methods often struggle to comprehensively capture both long-term dynamic evolution and subtle local patterns in a unified manner. In this work, we propose TimeDART, a novel self-supervised time series pre-training framework that unifies two powerful generative paradigms to learn more transferable representations. Specifically, we first employ a causal Transformer encoder, accompanied by a patch-based embedding strategy, to model the evolving trends from left to right. Building on this global modeling, we further introduce a denoising diffusion process to capture fine-grained local patterns through forward diffusion and reverse denoising. Finally, we optimize the model in an autoregressive manner. As a result, TimeDART effectively accounts for both global and local sequence features in a coherent way. We conduct extensive experiments on public datasets for time series forecasting and classification. The experimental results demonstrate that TimeDART consistently outperforms previously compared methods, validating the effectiveness of our approach. Our code is available at https://github.com/Melmaphother/TimeDART.