
2019-04-08 19:20:00




Forewarning Analysis of Chinese Listed Corporations' Financial Crisis Based on Data Mining Techniques

Advisor: Professor Xu Liben
Major: Quantitative Economics

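Abstract
This thesis systematically analyzes the current state and specific manifestations of financial crisis among Chinese listed corporations, together with its causes and preventive measures. Building financial crisis forewarning models is essential for predicting and preventing financial crisis in listed corporations. The thesis reviews the existing forewarning models and uses the data mining module of SAS to build several new ones. Compared with the existing discriminant analysis, Logistic, and neural network models, these models improve considerably on prediction timeliness, sample size, selection of financial indexes, and use of data mining techniques, and they achieve good forewarning results.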

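1. The Current Financial Crisis Problem of Chinese Listed Corporations
Definitions of financial crisis vary widely in China and abroad. Most define the concept by the concrete forms a crisis takes, such as bankruptcy, arrears on preferred stock dividends, or inability to repay debts. Once any one of these occurs, the corporation is considered to be in financial crisis.
In essence, a financial crisis is a concentrated outbreak of financial risk at large scale and high intensity. It shows up mainly as a severely deteriorated financial status and a payment crisis, and it can end in bankruptcy.
The financial condition of Chinese listed corporations is worrying. Listed corporations in financial crisis show the following to varying degrees: inability to repay debts as they fall due, with no debt restructuring plan; cash outflows exceeding inflows; a severe shortage of cash for payments; huge investments that yield no return; poor product sales; large inventory backlogs; severe contraction of the main business; profits that depend on related-party transactions and external subsidies; and involvement in lawsuits with huge compensation claims.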

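2. Causes of Chinese Listed Corporations' Financial Crisis and Preventive Measures
The causes of financial crisis among Chinese listed corporations include national macro-management and macroeconomic environment factors, security market deficiencies, and the corporate governance structure of listed corporations.
2.1 National macro-management and macroeconomic environment factors
China's economic and accounting legislation is still incomplete, so the necessary management and supervision of enterprises lacks a legal basis. The monetary, financial, and industrial policies that the state adopts in a given period may trigger financial crisis in the enterprises concerned. Macroeconomic depression and prosperity also have an important influence on enterprises.
In this respect the state should improve its management and supervision systems, and enterprises should respond actively to changes in national policy and the macroeconomic environment.
2.2 Security market deficiencies
Information in the security market is inefficient, flaws in the stock listing system allow corporations to list on the strength of falsified packaging, and the merger and acquisition function fails. Because of these deficiencies, the parties with an interest in a corporation do not do their utmost to maximize its value.
Improving China's security market requires the following: strictly controlling admission to listing, strengthening information disclosure by listed corporations, establishing sound laws and regulations, establishing a market-based, institutionalized exit mechanism, and vigorously developing institutional investors.
2.3 Corporate governance structure of listed corporations
A fundamental cause of financial crisis among Chinese listed corporations is the defective corporate governance structure. The internal ownership structure of a corporation directly affects the formation, operation, and performance of its governance structure, and different ownership structures produce different governance structures. The shortcomings of the ownership structure of Chinese listed corporations are most evident in the following: holders of state-owned shares tend to behave as administrative bodies; the dominance of state-owned shares leads to control and expropriation by the largest shareholder; insider control is serious; boards of directors do not operate to a proper standard; and managers lack incentive and restraint mechanisms. An irrational ownership structure seriously damages corporate value and shareholders' interests, and it is an important cause of financial crisis among Chinese listed corporations.
The ownership structure of Chinese listed corporations should be improved in the following ways: reducing state-owned shareholdings, establishing an independent director system, and establishing a stock option incentive mechanism for the managers of state-owned enterprises.
A financial crisis forewarning model can give early warning of, and early control over, business failure and financial mismanagement, and it provides important information to decision-makers, investors, and creditors. Building a forewarning model system is a necessary task for the security market and a problem that urgently needs solving.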
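3. Overview of Data Mining Techniques
3.1 Definition of data mining
Most existing financial crisis forewarning models are based on data mining techniques. Data mining is the process of extracting implicit, previously unknown, but potentially valuable information from large volumes of incomplete, noisy, fuzzy, and random real-world data, and of organizing that information into structured models. It emphasizes predicting future conditions and explaining the causes of past events. Data mining is an interdisciplinary field that combines several specialized techniques. Its tasks include concept description, association analysis, classification and prediction, clustering, and outlier analysis.
3.2 Classification of data mining techniques
Common data mining techniques fall into three broad categories: statistical analysis, knowledge discovery, and others.
(1) Statistical analysis data mining models use statistical techniques such as regression analysis, time-series analysis, and multivariate analysis. Many data mining tools are built on statistical techniques. Statistics succeeds as a data mining technique because it is applied to the same type of problem under the same conditions.
(2) Knowledge discovery data mining techniques mainly include case-based reasoning, neural networks, decision trees, expert systems, probabilistic rules, knowledge-based stochastic models, fuzzy logic, and evolutionary computation.
Applying data mining generally involves the following stages: defining the mining target, preparing the data, building the model, mining the data, analyzing the results, and applying the knowledge.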

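4. Financial Crisis Forewarning Models Based on Statistical Data Mining Techniques
Discriminant analysis is a statistical method for deciding which class an object of study belongs to. The early financial crisis forewarning models used discriminant analysis, in both single-variable and multi-variable forms.
4.1 Single-variable models
Single-variable models forecast financial crisis from the trend of individual financial ratios. Single-variable analysis is effective but has certain limitations.
4.2 Multi-variable models
Multi-variable models build a multivariate function: they weight and sum several of a corporation's financial indexes into an overall discriminant score, which is used to forecast the likelihood of financial crisis. The first to apply multi-variable analysis to financial crisis forewarning was the American scholar Altman (1968), whose Z-score model is highly influential in this field. Later models, such as the Development Bank of Japan model and the F-score model, were built on the Z-score model.
4.3 Limitations of discriminant analysis forewarning models
Discriminant analysis forewarning models have two limitations that cannot be overcome:
(1) Fixed influence. The marginal effect of every explanatory variable is constant. This fixed-influence assumption is inherent in the linear functional form, and it does not match reality.
(2) Complete linear compensation. Every linear function has this problem.
To overcome these limitations, researchers built financial crisis forewarning models with Logistic regression, which handles them well. Introducing the Logistic method removes the fixed-influence assumption inherent in linear functions. To address complete linear compensation, some researchers add quadratic and cross terms to the Logistic function; a Logistic model with such terms has better classification ability.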
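The idea of adding quadratic and cross terms can be sketched as a simple feature expansion. The function name and the three sample ratios below are hypothetical; the thesis does not say which terms were actually added. With a product term in the model, the effect of one ratio depends on the level of another, so a weak ratio can no longer be offset one-for-one by a strong one.

```python
from itertools import combinations

def expand_with_quadratic_terms(ratios):
    """Return the original ratios followed by their squares and pairwise products.

    This is only the feature-expansion step; the expanded list would then be
    fed to a Logistic regression in place of the raw ratios.
    """
    squares = [x * x for x in ratios]
    crosses = [a * b for a, b in combinations(ratios, 2)]
    return list(ratios) + squares + crosses

# Hypothetical ratios for one corporation (not taken from the thesis).
features = expand_with_quadratic_terms([0.5, 2.0, -1.0])
print(len(features), features)
```

For p input ratios the expansion yields p + p + p(p-1)/2 terms, so with many indexes the expanded model grows quickly and some selection of terms would be needed.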

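5. Building Financial Crisis Forewarning Models Based on Data Mining Techniques with SAS
This thesis takes 157 industrial corporations listed as A shares on the Shanghai market as its sample. For each corporation, 43 financial indexes are calculated from the financial reports at five points in time.
The data are loaded into SAS and analyzed there. The following models were built.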
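5.1 Financial status composite scoring model
A single index reflects only one aspect of a corporation's financial status, not the whole picture, so judging that a corporation is in financial crisis from a single index, or from its designation as "Special Treatment", is highly limited. Principal component analysis is a dimension-reduction statistical method: it replaces a large number of original variables with as few composite indexes as possible while retaining as much of the information in the original data as possible. This chapter uses principal component analysis to build a composite evaluation system for the financial status of listed corporations and assigns each corporation a composite financial status score. Such a system is more rigorous and scientific than evaluating a listed corporation by one financial index or by whether the China Securities Regulatory Commission has designated it "Special Treatment".
The 19 financial indexes with the greatest influence on financial status are derived from the eigenvectors of the principal component analysis and the contribution ratios of the principal components.
In each of the five quarters, close to half of the corporations have a composite score below zero. Clearly not every corporation with a negative composite score is in financial crisis, but from a forewarning standpoint these corporations deserve attention.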
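5.2 Logistic regression and decision tree models
Based on the results of the principal component analysis, the financial status of listed corporations in the forecast period is divided into two classes, which serve as the target variable. The 19 most influential financial indexes serve as input variables. Logistic regression and decision trees are then used to forecast the financial status of listed corporations.
A decision tree is an instance-based inductive algorithm that can derive classification rules, expressed as a tree, from a set of unordered, irregular cases. This thesis uses three commonly used decision tree algorithms.
The error rates of the three algorithms differ very little, and all three are acceptable.
Decision trees make it possible to judge a corporation's financial status quickly from only a few financial indexes. A corporation classified as having poor financial status is not necessarily in financial crisis, but it deserves further investigation and analysis. This is the value of the decision tree forewarning model.
The decision tree models have a lower error rate than the Logistic model, so they forecast better, and they are also simple and practical.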
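5.3 Linear regression and neural network models
A neural network is an information processing system built as a theoretical abstraction, simplification, and simulation of the structure, function, and certain basic characteristics of the biological neural networks of the human brain.
The linear regression and neural network forewarning models take each corporation's composite score as the target variable and the 19 most influential financial indexes as input variables.
The linear regression model is simple and effective. Where high accuracy is not required, it gives a quick judgment of a corporation's financial status.
The neural network model predicts the composite score better than linear regression, and its results are accurate and credible.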

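6. Innovations of This Thesis
6.1 Principal component analysis is used to build a composite financial status scoring system for listed corporations that covers 43 financial indexes. Compared with existing scoring systems based on principal component analysis, it uses more indexes with broader coverage and removes the influence of subjective choice, so its evaluation of financial status is more scientific and accurate.
6.2 The eigenvector matrix of the principal component analysis and the contribution ratios of the principal components are used to identify the 19 financial indexes with the greatest influence on a corporation's financial status. These 19 indexes cover seven main aspects of financial status and include almost all of the cash flow and growth indexes. They predict a corporation's future financial status well.
6.3 Logistic regression and linear regression forewarning models are built on these 19 indexes. Their prediction accuracy is higher than that of the existing Logistic regression and linear regression forewarning models.
6.4 Decision trees are applied to financial crisis forewarning, and three sets of decision rules are derived with three decision tree algorithms. The rules offer a simple, practical way to judge a corporation's financial status quickly.
6.5 A neural network forewarning model is built that improves substantially on the existing ones. First, the input variables of existing models are chosen by hand, whereas this model takes the 19 indexes above as inputs, which removes subjective influence and makes the model more accurate and scientific. Second, existing models only sort future financial status into simple classes, whereas the target variable of this model is the composite financial status score, which gives a more objective forecast of a listed corporation's future financial status.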



Forewarning Analysis of Chinese Listed Corporations’
Financial Crisis Based on Data Mining Techniques


Advisor: Prof. Xu Liben
Major: Quantitative Economics


Abstract
This article systematically analyzes the current state and manifestations of financial crisis among Chinese listed corporations, and discusses its causes and the measures for preventing it. It explains why forewarning models are necessary for forecasting and preventing financial crisis, reviews the existing financial crisis forewarning models, and builds several forewarning models with the Enterprise Miner module of SAS. Compared with the existing discriminant analysis, Logistic, and Neural Network models, the models in this article improve on prediction timeliness, sample size, selection of financial indexes, and use of Data Mining techniques, and they achieve good forewarning results.

1. Current state of Chinese listed corporations' financial crisis
There are many definitions of financial crisis in China and abroad. Most define the concept by the forms a crisis takes, such as bankruptcy, default on preferred stock dividends, or inability to repay debt. A corporation is considered to be in financial crisis once any one of these occurs.
In essence, a financial crisis is a concentrated, high-intensity eruption of financial risk, characterized by an extremely deteriorated financial status, a payment crisis, and even bankruptcy.
The financial status of Chinese listed corporations is a cause for concern. Corporations in financial crisis show the following symptoms to some degree: inability to repay debts due, with no debt restructuring plan; cash outflows exceeding inflows; a severe shortage of cash for payments; huge investments with no return; poor product sales; large inventory backlogs; a severely shrinking main business; profits dependent on related-party transactions and external subsidies; and involvement in lawsuits with huge compensation claims.

2. Causes of Chinese listed corporations' financial crisis and preventive measures
The causes of financial crisis among Chinese listed corporations include national macro-management and macroeconomic factors, security market limitations, and corporate governance factors.
2.1 National macro-management and macroeconomic factors
China's economic and accounting legal system is still imperfect, so the necessary management and supervision of corporations lacks a legal basis. The monetary, financial, and industrial policies adopted by the state in a given period may trigger financial crisis in the corporations concerned. Macroeconomic depression and prosperity also strongly influence corporations.
The state should therefore improve its management and supervision systems, and corporations should respond actively to changes in national policy and the macroeconomic environment.
2.2 Security market limitations
Information in the security market is inefficient, defects in the stock listing system allow corporations to list on the strength of falsified packaging, and the merger and acquisition function fails. Because of these limitations, the parties with an interest in a corporation do not do their best to maximize its value.
The following measures are needed to improve the Chinese security market: strictly controlling admission to listing, strengthening information disclosure by listed corporations, establishing and improving laws and regulations, setting up a market-based, institutionalized exit mechanism, and vigorously developing institutional investors.
2.3 Corporate governance factors of listed corporations
Defective corporate governance is one of the fundamental causes of financial crisis among Chinese listed corporations. The ownership structure of a corporation directly affects the formation, operation, and performance of its governance, and different ownership structures produce different governance structures. The shortcomings of Chinese listed corporations' ownership structure are as follows: holders of state-owned shares tend to behave as administrative bodies; an excessive proportion of state-owned shares leads to control and expropriation by the largest shareholder; insider control is serious; boards of directors do not operate to a proper standard; and managers lack incentive and restraint mechanisms. An irrational ownership structure damages corporate value and shareholders' interests, and it is an important cause of financial crisis among Chinese listed corporations.
The ownership structure of Chinese listed corporations should be improved by reducing the proportion of state-owned shares, setting up an independent director system, and establishing a stock option incentive mechanism for the managers of state-owned enterprises.
Financial crisis forewarning models can give early warning of, and early control over, management failure and financial mismanagement, and they provide important information to decision-makers, investors, and creditors. Establishing such models is both necessary and urgent for the security market.


3. Overview of Data Mining Techniques
3.1 Definition of Data Mining
Most existing financial crisis forewarning models are based on Data Mining techniques. Data Mining extracts concealed, previously unknown, but potentially valuable information from large volumes of incomplete, noisy, fuzzy, and random data, and organizes that information into structured models. It emphasizes predicting future conditions and explaining the causes of past events. Data Mining is an interdisciplinary subject that combines several specialized techniques. Its tasks include concept description, association analysis, classification and prediction, clustering, and outlier analysis.
3.2 The classification of Data Mining techniques
Data Mining techniques can be classified into statistical analysis techniques, knowledge discovery techniques, and other techniques.
(1) Statistical analysis Data Mining techniques include regression analysis, time-series analysis, and multivariate analysis. Many Data Mining tools are built on statistical techniques. Statistics succeeds as a Data Mining technique because it is applied to the same type of problem under the same conditions.
(2) Knowledge discovery Data Mining techniques include instance-based reasoning, Neural Networks, decision trees, expert systems, probabilistic rules, knowledge-based stochastic models, fuzzy logic, and evolutionary computation.
The application of Data Mining generally passes through these stages: defining the mining target, preparing the data, building the model, mining the data, analyzing the results, and applying the knowledge.

4. Financial crisis forewarning models based on statistical analysis Data Mining techniques
Discriminant analysis is a statistical method for deciding which class an object of study belongs to, and the early financial crisis forewarning models used it. These early models include single-variable models and multi-variable models.
4.1 Single variable models
Single-variable models forecast financial crisis by analyzing the trend of individual financial ratios. They are effective to a degree but have limitations.
4.2 Multi-variable models
Multi-variable models are multivariate functions: they weight and sum several financial ratios into an overall discriminant score, which is used to forecast the likelihood of financial crisis. The first multi-variable model for forecasting financial crisis was the Z-score model established by the American scholar Altman (1968), which is highly influential in financial crisis forewarning. Later models, such as the Development Bank of Japan model and the F-score model, are based on the Z-score model.
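As an illustration of the weighted-sum idea, the sketch below computes the Z-score in its commonly quoted 1968 form for publicly traded manufacturers. The coefficients and cut-offs are the widely cited ones, not values taken from this article, and the sample figures are hypothetical.

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_value_equity, sales, total_assets, total_liabilities):
    """Weighted sum of five ratios (commonly quoted Altman 1968 coefficients)."""
    x1 = working_capital / total_assets          # liquidity
    x2 = retained_earnings / total_assets        # cumulative profitability
    x3 = ebit / total_assets                     # operating profitability
    x4 = market_value_equity / total_liabilities # leverage
    x5 = sales / total_assets                    # asset turnover
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5

def zone(z):
    """Classify a score with the commonly cited cut-offs 1.81 and 2.99."""
    if z > 2.99:
        return "safe"
    if z < 1.81:
        return "distress"
    return "grey"

# Hypothetical balance-sheet figures, all in the same currency unit.
z = altman_z(working_capital=200, retained_earnings=300, ebit=150,
             market_value_equity=800, sales=1000,
             total_assets=1000, total_liabilities=400)
print(round(z, 3), zone(z))
```

Because the score is linear, each ratio contributes at a fixed rate regardless of the others, which is the property the limitations discussed next refer to.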
4.3 The limitations of discriminant analysis financial crisis forewarning models
Discriminant analysis financial crisis forewarning models have two limitations that cannot be overcome:
(1) Fixed influence. The marginal effect of every explanatory variable is constant. This assumption is inherent in linear functions, but it does not conform to reality.
(2) Complete linear compensation. This problem exists in every linear function.
To overcome these limitations, Logistic regression was used to build financial crisis forewarning models. The Logistic form removes the fixed-influence assumption. To address complete linear compensation, some researchers add quadratic and cross terms to the Logistic function, which improves its classification ability.
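A small numerical sketch shows why the Logistic form escapes the fixed-influence assumption. For P = 1 / (1 + exp(-s)) with linear score s, the marginal effect of a variable with coefficient beta is beta * P * (1 - P), which changes with the level of the score. The coefficient value below is hypothetical.

```python
import math

def logistic_probability(score):
    """Probability of crisis for a given linear score."""
    return 1.0 / (1.0 + math.exp(-score))

def marginal_effect(beta, score):
    """Derivative of the probability with respect to one input variable."""
    p = logistic_probability(score)
    return beta * p * (1.0 - p)

beta = 0.8  # hypothetical coefficient of one financial ratio
for score in (-4.0, 0.0, 4.0):
    print(score,
          round(logistic_probability(score), 4),
          round(marginal_effect(beta, score), 4))
```

The effect is largest near a probability of one half and shrinks toward zero for corporations that are already clearly healthy or clearly distressed, whereas in a linear discriminant function it would be the same everywhere.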

5. Establishing financial crisis forewarning models based on Data Mining techniques with SAS software
A sample of 157 industrial corporations listed as A shares on the Shanghai market was selected, and 43 financial indexes were calculated from the financial reports at five points in time.
The data were loaded into SAS and analyzed, and several models were established.
5.1 Financial status composite scoring model
A single index reflects only one aspect of financial status, not the whole picture, so judging that a corporation is in financial crisis from a single index, or from its "Special Treatment" designation, is of limited value.
Principal Component Analysis is a dimension-reduction statistical method. It uses as few composite indexes as possible to reflect as much of the information in the original data as possible. In this article, a composite evaluation system for listed corporations' financial status was established with Principal Component Analysis, and a composite financial status score was given for every corporation. This system is more rigorous than evaluating a listed corporation by a single financial index or by whether it is under Special Treatment.
The 19 most influential financial indexes were identified from the eigenvectors of the Principal Component Analysis and the contribution ratios of the principal components.
The composite scores for the five quarters show that nearly half of the corporations had a negative score in each quarter. A negative composite score does not necessarily mean that a corporation is in financial crisis, but such corporations deserve closer attention.
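One common way to turn principal components into a composite score is sketched below with NumPy. It is an assumption about the procedure, not the article's documented SAS workflow: the 85% cumulative-variance target, the function name, and the random sample data are all hypothetical. Components are weighted by their contribution ratios, and the implied weight on each original index gives a ranking of index influence.

```python
import numpy as np

def composite_scores(X, variance_target=0.85):
    """X has one row per corporation and one column per financial index.

    Returns the composite score per corporation and the implied weight of
    each original index. The 0.85 target is an assumed convention.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize indexes
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)            # ascending order
    order = np.argsort(eigvals)[::-1]                  # largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = eigvals / eigvals.sum()                    # contribution ratios
    k = int(np.searchsorted(np.cumsum(ratio), variance_target)) + 1
    k = min(k, len(ratio))
    # Eigenvector signs are arbitrary; a real analysis would orient each
    # component so that a higher value means better financial status.
    index_weights = eigvecs[:, :k] @ ratio[:k]
    return Z @ index_weights, index_weights

# Hypothetical data with the article's dimensions: 157 corporations, 43 indexes.
rng = np.random.default_rng(0)
sample = rng.normal(size=(157, 43))
scores, weights = composite_scores(sample)
top_19 = np.argsort(np.abs(weights))[::-1][:19]        # most influential indexes
print(f"share of negative scores: {(scores < 0).mean():.2f}")
```

Since the inputs are standardized, scores built this way average zero across the sample, which may be why close to half of the corporations fall below zero in every quarter.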
5.2 Logistic Regression and Decision Tree Model
Based on the results of Principal Component Analysis, listed corporations' financial status in the forecast period was divided into two classes. These two classes were taken as the target variable, and the 19 financial indexes were taken as input variables. Logistic Regression and Decision Trees were used to forecast listed corporations' financial status.
A Decision Tree is an instance-based inductive algorithm that can derive classification rules from a set of unordered, irregular cases. Three commonly used Decision Tree algorithms were adopted in this article.
The error rates of the three algorithms differed very little, so all three are acceptable.
Decision Trees make it possible to judge a listed corporation's financial status quickly from only a few financial indexes. A corporation labeled as having poor financial status is not necessarily in financial crisis, but it deserves further analysis.
The error rate of the Decision Tree models was lower than that of the Logistic model, so the Decision Tree models forecast better, and they are also simple and practical.
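To show how a tree turns a few indexes into a quick rule, here is a minimal sketch of a single split chosen by Gini impurity. The article does not name its three algorithms, so Gini is an assumed criterion, and the ratio values and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_threshold(values, labels):
    """Return (impurity, cut) for the split on one index with lowest impurity."""
    pairs = sorted(zip(values, labels))
    best = (float("inf"), None)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point between equal values
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if impurity < best[0]:
            best = (impurity, cut)
    return best

# Hypothetical data: a liquidity ratio and a 0/1 "poor financial status" label.
current_ratio = [0.6, 0.8, 0.9, 1.4, 1.6, 2.1]
poor_status = [1, 1, 1, 0, 0, 0]
impurity, cut = best_threshold(current_ratio, poor_status)
print(f"rule: current_ratio < {cut:.2f} -> poor status (impurity {impurity:.2f})")
```

A full tree repeats this search on each resulting subset and across all input indexes; the readable if-then rules that result are what make the method convenient for quick screening.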
5.3 Linear Regression and Neural Network Model
A Neural Network is an information processing system that abstracts, simplifies, and simulates the structure and function of the human brain's neural networks.
In the Linear Regression and Neural Network models, every corporation's composite score was taken as the target variable and the 19 financial indexes were taken as input variables.
The Linear Regression model is simple and effective, and it can estimate financial status quickly where high accuracy is not required.
The Neural Network model forecasts the composite score better than Linear Regression.

6. Innovations of This Article
6.1 A composite scoring system for listed corporations' financial status, covering 43 financial indexes, was established with Principal Component Analysis. Compared with existing systems, it uses more indexes with broader coverage and removes the influence of subjective choice, so its evaluation is more accurate.
6.2 The 19 most influential financial indexes were identified from the eigenvector matrix and the principal components' contribution ratios in Principal Component Analysis. These 19 indexes cover seven main aspects of listed corporations' financial status and include almost all indexes relating to cash flow and corporate growth.
6.3 Logistic Regression and Linear Regression forewarning models of financial crisis were built with the 19 financial indexes. Their accuracy was higher than that of the existing models.
6.4 Decision Trees were used for financial crisis forewarning, and three sets of decision rules were derived with three Decision Tree algorithms. These rules offer a convenient way to judge a corporation's financial status.
6.5 A Neural Network model was established that improves on the existing models. First, the input variables of the existing models were chosen subjectively, whereas this model takes the 19 financial indexes as input variables. Second, the existing models only classify future financial status as good or bad, whereas the model in this article forecasts the composite financial status score.