
Sociological Methods

Assembling the Parts into a Whole: Generating Macro-Level Social Data with Latent Variable Models and Dynamic Bayesian Methods

2024-12-12 Authors: Zhang Gaoxiang (张高祥), Chen Zhe (陈哲), Chen Yunsong (陈云松)

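【About the authors】Zhang Gaoxiang, School of Social and Behavioral Sciences (社会学院), Nanjing University; Chen Zhe, School of Social and Behavioral Sciences, Nanjing University; Chen Yunsong, School of Social and Behavioral Sciences, Nanjing University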

【Source】《社会》 (Shehui), 2024, Issue 3

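【Summary】The search for causal mechanisms and for macro-level tests has created a demand in quantitative sociology for data at the aggregate (district cluster) level, yet high-quality longitudinal data of this kind remain relatively scarce. Existing studies usually address the shortage of macro data by pooling individual-level social survey data from multiple sources into panel datasets, but this approach is limited by the sparse temporal and spatial coverage of social surveys and by differences between surveys. This paper introduces a dynamic Bayesian latent variable modeling framework for generating cluster-level panel data across time and space, demonstrates its application through a worked example, and compares it with several commonly used missing-value imputation methods. The example results indicate that the dynamic Bayesian latent variable model has important advantages in integrating information across time, space, and multiple dimensions and in exploring parameter uncertainty. It can estimate and impute values for years or regions missing from the survey data, substantially easing the shortage of panel data in sociological research.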

【Keywords】data generation, dimension integration, latent variables, Bayesian item response model, dynamic linear model

【Full text】https://www.society.shu.edu.cn/CN/Y2024/V44/I3/173


Generating Macro-Level Data Using Latent Variable Modeling and Dynamic Bayesian Methods

Abstract: In contemporary quantitative sociological research, the testing of causal mechanisms and macro theories has driven researchers' need for high-quality time-series data at the district cluster level. However, compared with fields such as economics, sociological research has far less access to large-scale tracking data that span long periods. Aggregating individual social survey data from multiple sources to generate panel data is an important way to relieve this scarcity, but it is constrained by the limited spatial and temporal coverage of social surveys and by the variability across surveys. In this paper, we introduce a dynamic Bayesian latent variable modeling framework designed to generate complete panel data at the cluster level. The implementation of this framework is demonstrated through a practical example, and its performance is compared with that of several common missing data imputation techniques. The results show that the dynamic Bayesian latent variable model has noticeable advantages in temporal-spatial imputation, in the integration of multi-dimensional social indices, and in the exploration of parameter uncertainty. The method can estimate and impute values for years and regions missing from survey data, which points to its future application in panel data generation and dimension integration for macro-level sociological research. However, the practical application of this approach still faces certain limitations, such as data availability, “synonym repetition”, and insufficient sensitivity to drastic changes. In view of this, the paper proposes corresponding optimization strategies to make the modeling framework more applicable and flexible, thereby expanding its scope of application in the social sciences. The paper thus offers guidance for the practical application of the dynamic Bayesian latent variable modeling approach and a starting point for future related studies.
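
To make the imputation idea in the abstract concrete, the following is a minimal sketch of the dynamic linear model component only, not the authors' framework. It assumes a single region whose latent level follows a random walk, with known observation and state variances, and it filters forward through years in which no survey aggregate exists. The function name `local_level_filter`, the variance values, and the toy series are all hypothetical choices made for illustration; the full approach described in the abstract additionally involves a Bayesian item response model and estimation of the parameters themselves.

```python
def local_level_filter(y, obs_var, state_var, m0=0.0, c0=10.0):
    """Kalman filter for a random-walk latent level.

    y         : yearly survey aggregates; None marks a year with no survey
    obs_var   : assumed (known) sampling variance of each aggregate
    state_var : assumed (known) year-to-year variance of the latent level
    m0, c0    : prior mean and variance of the latent level
    Returns the filtered means and variances for every year.
    """
    means, variances = [], []
    m, c = m0, c0
    for obs in y:
        # Predict: the level is carried forward and its uncertainty grows.
        c = c + state_var
        if obs is not None:
            # Update: move the prediction toward the observed aggregate,
            # weighting by the relative precision of prior and data.
            gain = c / (c + obs_var)
            m = m + gain * (obs - m)
            c = (1.0 - gain) * c
        means.append(m)
        variances.append(c)
    return means, variances


# Hypothetical series: surveys exist only in years 0 and 3.
series = [0.2, None, None, 0.8, None]
est_mean, est_var = local_level_filter(series, obs_var=0.01, state_var=0.005)
for year, (mu, var) in enumerate(zip(est_mean, est_var)):
    print(year, round(mu, 3), round(var, 4))
```

In years without a survey the estimate is carried forward while its variance increases, which is the sense in which such a model reports uncertainty for imputed values. A smoother that also uses later observations, and priors on the variances, would be needed to approach the fully Bayesian treatment the abstract describes.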

Key words: data generation, dimension integration, latent variables, Bayesian item response theory model, dynamic linear model
