A Systematic Review of Social Media Analytics Pipelines: Verification, Optimization, and Scalable Computing Perspectives
Abstract
Social media analytics pipelines have become indispensable for extracting meaningful insights from large-scale, dynamic, and heterogeneous data generated on platforms such as Twitter, Facebook, and Instagram. These pipelines typically involve stages such as data collection, preprocessing, feature extraction, model training, and deployment. However, the increasing volume and velocity of social media data introduce significant challenges related to verification, scalability, and system optimization. This review synthesizes findings from multiple studies, emphasizing verification mechanisms, optimization strategies, and scalable computing architectures. Techniques such as data validation, anomaly detection, and model auditing play a crucial role in ensuring the reliability of analytics pipelines, while optimization approaches including parallel processing, edge computing, and adaptive learning enhance performance and efficiency. Scalable infrastructures such as cloud computing, distributed systems, and stream processing platforms support real-time analytics and large-scale deployment. Furthermore, the integration of Graph Neural Networks (GNNs) has significantly improved the modeling of relational data, particularly in detecting adversarial activities by capturing complex graph structures and identifying abnormal patterns. Despite these advancements, challenges such as computational complexity, scalability constraints, and robustness against sophisticated attacks remain, indicating important directions for future research.
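To make the verification stage concrete, the following is a minimal sketch of data validation followed by a simple anomaly check. It is an illustration only and is not drawn from any of the reviewed studies. The record schema (`id`, `user`, `text`, `timestamp`), the function names, and the choice of a modified z-score based on the median absolute deviation with a threshold of 3.5 are all assumptions made for this example.

```python
from collections import Counter
from statistics import median

# Hypothetical schema for a collected post; real platform payloads differ.
REQUIRED_FIELDS = ("id", "user", "text", "timestamp")


def validate_post(post: dict) -> bool:
    """Return True if the record has every required field and non-empty text."""
    if any(field not in post for field in REQUIRED_FIELDS):
        return False
    text = post["text"]
    return isinstance(text, str) and text.strip() != ""


def count_by_user(posts: list) -> dict:
    """Count valid posts per user; invalid records are dropped first."""
    return dict(Counter(p["user"] for p in posts if validate_post(p)))


def flag_anomalies(counts: dict, threshold: float = 3.5) -> list:
    """Flag users whose post count has a modified z-score above the threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single extreme account does not distort the baseline.
    """
    values = list(counts.values())
    if not values:
        return []
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread in the data, nothing to flag under this rule
    return sorted(
        user for user, c in counts.items()
        if 0.6745 * (c - med) / mad > threshold
    )
```

In a pipeline of the kind described above, `count_by_user` would sit after data collection and `flag_anomalies` would feed a review or auditing step. A production system would likely replace the per-user count with richer features and a learned detector, for example a graph-based model.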

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.