Most technology forecasting has depended on the knowledge and experience of experts. This has caused problems such as inaccuracy and wasted time and cost, so more objective and accurate methods for technology forecasting are needed. Recently, many studies on technology forecasting have been published, most of them based on the analysis of patent data. In this study, the researchers propose a technology forecasting model using frequency time series analysis in the bio-technology domain. The experimental data are bio-technology patents. The experimental results verify the improved performance of the frequency time series model.
INTRODUCTION
Hall et al. (2001) argued that patent data are better suited than other data for technology forecasting. Dernis et al. (2001) also published a time series approach to technology forecasting. In this study, we propose a method for technology forecasting using a frequency time series model. Many experts have forecast technology subjectively, based on their experience and knowledge (Lee et al., 2009; Wang et al., 1998; Yoon and Park, 2007; Yoon and Lee, 2008), but a more objective approach is needed for efficient forecasting.
To address this problem, we consider a frequency time series model based on patent data. To verify the method, we use US patents on bio-technology (Vapnik, 1998).
Related research: There has been much research on technology forecasting (Fattori et al., 2003; Feinerer et al., 2008; Lee et al., 2009; Wang et al., 1998; Yoon and Park, 2007; Yoon and Lee, 2008). Some of it depends on keywords extracted from patent documents using text mining (Fattori et al., 2003; Feinerer et al., 2008). Citation analysis methods have also been used (Hall et al., 2001). These approaches have contributed a great deal to technology forecasting (Lee et al., 2009). The results of technology forecasting let us plan research and development (R and D) processes. In general, the R and D plan is very important to governments and companies (Yoon and Park, 2007; Yoon and Lee, 2008). The plan is closely tied to project cost, and an efficient R and D plan also helps avoid overlapping investment. Many companies have suffered patent infringement suits brought by other companies or patent trolls (Lee et al., 2009; Yoon and Park, 2007). Before researching and developing a technology, researchers have to analyze the technological results achieved so far. One source of such data is the patent document. Patent data are highly objective for technology forecasting, and objective forecasting needs more accurate results. However, many forecasting processes have depended on the subjective knowledge of domain experts. In this research, the researchers propose an objective approach to technology forecasting: a frequency time series method is used to analyze patent data for efficient forecasting of technology.
Frequency time series model for bio-technology forecasting: Traditional time series methods focus on continuous data (Brockwell and Davis, 2002). ARIMA (Auto-Regressive Integrated Moving-Average) is a popular time series model (Tsay, 2005), but it is not well suited to frequency time series data such as patent frequency by year. In this study, the researchers forecast the trend of a technology using MA (Moving Average). We also compare this approach with linear regression, Poisson regression and SVR (Support Vector Regression).
Linear regression and Poisson regression: Linear regression models the dependence of one variable on another (Myers, 1989). Its functional form is:
$$y_i = \alpha + \beta x_i + \varepsilon_i \qquad (1)$$
Where:
$x_i$ and $y_i$ | = | Independent and dependent variables, respectively |
$\alpha$ and $\beta$ | = | The intercept and slope of the regression line |
$\varepsilon_i$ | = | Error from a Normal distribution with mean 0 and variance $\sigma_\varepsilon^2$ |
In the technology forecasting model, x is the time (year) and y is the patent frequency. Linear regression assumes that y is normally distributed, which is a distribution for continuous data. Strictly speaking, however, patent frequency is not continuous, so we also consider regression models suited to discrete data.
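As a minimal sketch of the linear trend model, the closed-form least-squares estimates of the intercept and slope can be computed as follows. The function names and the yearly counts are hypothetical and chosen only for illustration; they are not the patent data used in the experiment.

```python
import numpy as np

def fit_linear_trend(years, counts):
    """Least-squares fit of counts = alpha + beta * year; returns (alpha, beta)."""
    x = np.asarray(years, dtype=float)
    y = np.asarray(counts, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of x and y divided by the variance of x.
    beta = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    alpha = y_mean - beta * x_mean
    return alpha, beta

def predict_linear(alpha, beta, years):
    """Evaluate the fitted regression line at the given years."""
    return alpha + beta * np.asarray(years, dtype=float)

# Hypothetical yearly patent counts, invented for illustration only.
train_years = [1981, 1982, 1983, 1984, 1985]
train_counts = [10, 12, 14, 16, 18]

alpha, beta = fit_linear_trend(train_years, train_counts)
forecast = predict_linear(alpha, beta, [1986, 1987])
```

Because the fitted line is extrapolated directly, the forecast can be non-integer or even negative, which is one practical reason to look at count models as well.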
One of these is the Poisson regression model, which treats y as discrete (count) data and assumes that y follows a Poisson distribution. The researchers use both regression models to forecast the technology trend in the experiment.
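The following is a sketch of Poisson regression with a log link, log(mu) = b0 + b1 * x, fitted by Newton's method. The log link, the use of a year offset instead of the calendar year (to keep the exponential numerically stable), the function name and the counts are all assumptions made for illustration; the text does not state how the model was fitted.

```python
import numpy as np

def fit_poisson_trend(x, y, n_iter=50, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x by Newton's method; returns array [b0, b1]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    # Start from the intercept-only fit so that exp() stays well behaved.
    coef = np.array([np.log(y.mean()), 0.0])
    for _ in range(n_iter):
        mu = np.exp(X @ coef)
        gradient = X.T @ (y - mu)          # score vector of the log-likelihood
        hessian = X.T @ (X * mu[:, None])  # Fisher information matrix
        step = np.linalg.solve(hessian, gradient)
        coef = coef + step
        if np.max(np.abs(step)) < tol:
            break
    return coef

# Hypothetical counts, invented for illustration; t is the year offset.
t = [0, 1, 2, 3, 4]
counts = [10, 13, 15, 21, 26]

coef = fit_poisson_trend(t, counts)
next_year_mean = float(np.exp(coef[0] + coef[1] * 5))  # forecast for t = 5
```

Unlike the linear fit, the predicted mean is always positive, which matches the nature of frequency data.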
Support vector regression: SVM (Support Vector Machine) is a non-linear model (Haykin, 1999; Vapnik, 1998). SVM was first developed for classification (Vapnik, 1998) and has more recently been applied to regression and clustering. SVM for regression is called SVR, and SVM for clustering is called SVC (Support Vector Clustering) (Burges, 1998; Haykin, 1999). By using an alternative loss function, SVR can perform well on regression problems. Given a data set $\{(X_1, y_1), (X_2, y_2), \ldots, (X_n, y_n)\}$ and a linear model $f(X) = w \cdot X + b$, the optimal regression function is obtained by minimizing the following functional:
$$\frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \qquad (2)$$
where C is a constant and $\xi_i$ and $\xi_i^*$ are slack variables.
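To make the role of C and the slack variables concrete, the sketch below evaluates the usual SVR objective, 0.5 * ||w||^2 + C * sum(xi + xi_star), for a given linear model. It assumes the epsilon-insensitive loss, which the text only refers to as "an alternative loss function"; the function name, the tube width epsilon and the toy data are hypothetical.

```python
import numpy as np

def svr_objective(w, b, X, y, C=1.0, epsilon=0.1):
    """Value of 0.5*||w||^2 + C*sum(xi + xi_star) for f(x) = w.x + b."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    residual = y - (X @ w + b)
    # Slack is the amount by which a point falls outside the epsilon tube.
    xi = np.maximum(0.0, residual - epsilon)        # points above the tube
    xi_star = np.maximum(0.0, -residual - epsilon)  # points below the tube
    return 0.5 * float(w @ w) + C * float(np.sum(xi + xi_star))

# Toy data, invented for illustration: only the last point lies outside the tube.
X_toy = [[0.0], [1.0], [2.0]]
y_toy = [0.0, 1.0, 3.0]
value = svr_objective(w=[1.0], b=0.0, X=X_toy, y=y_toy, C=1.0, epsilon=0.1)
```

A larger C penalizes points outside the tube more heavily, while a smaller C favors a flatter function.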
Moving average and central limit theorem: MA is a smoothing method in time series analysis (Brockwell and Davis, 2002; Tsay, 2005). Given a time series data set $(y_1, y_2, \ldots, y_T)$, the MA at point t computes $F_m$, the mean of the previous m points, by the following formula:
$$F_m = \frac{1}{m} \sum_{i=1}^{m} y_{t-i} \qquad (3)$$
Traditional MA requires continuous time series data, but patent frequency is discrete. To address this, we appeal to the CLT (Central Limit Theorem) (Casella and Berger, 2002). The patent frequency data are sums aggregated by year (1981-1995), and by the CLT such sums are approximately normally distributed, so they can be treated like continuous data. Therefore, we use MA to forecast bio-technology.
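A minimal sketch of the MA forecast described above, taking the mean of the last m observations as the forecast for the next point. The function name and the yearly counts are hypothetical and serve only as an illustration.

```python
def moving_average_forecast(series, m):
    """Mean of the last m observations, used as the one-step-ahead forecast."""
    if m < 1 or m > len(series):
        raise ValueError("m must be between 1 and len(series)")
    window = series[-m:]
    return sum(window) / m

# Hypothetical yearly patent counts, invented for illustration only.
counts = [10, 12, 15, 19, 24]

f2 = moving_average_forecast(counts, 2)  # mean of 19 and 24
f3 = moving_average_forecast(counts, 3)  # mean of 15, 19 and 24
```

A small m follows recent changes closely, whereas a large m smooths more and lags behind a rising trend.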
RESULTS AND DISCUSSION
The researchers use four IPC (International Patent Classification) codes for typical bio-technology, taken from US patents (Vapnik, 1998). Table 1 shows these codes. To verify the proposed model, the researchers use biotechnology patents classified under C12M, C12N, C12P and C12Q.
These are popular IPC codes relevant to biotechnology. The researchers obtained the patent data from the USPTO (United States Patent and Trademark Office). The training data consist of patents from 1981-1995.
The data from 1996-2000 are used for testing the new model. The IPC has a hierarchical structure consisting of section, class, sub-class, main group and sub-group, and the patent data for the experiments are taken at the sub-class level. Table 2 and Fig. 1-4 show the trend of patent frequency in bio-technology.
The researchers found some interventions in 1994, 1995 and 1996, which suggests that some events occurred in these periods. In the experiment, linear regression, Poisson regression, SVR and MA are considered as comparative models.
Table 1: IPC codes and their subclasses
Table 2: IPC codes and their sub-classes
Fig. 1: Trend of patent frequency (C12M)
Fig. 2: Trend of patent frequency (C12N)
Fig. 3: Trend of patent frequency (C12P)
Fig. 4: Trend of patent frequency (C12Q)
First, we obtained a technology forecasting result using linear regression. Table 3 shows the regression parameters (b0: intercept, b1: slope of the regression line), R² and MSE (mean squared error) (Han and Kamber, 2001; Myers, 1989). The performance of the Poisson regression model for bio-technology forecasting is shown in Table 4, where AIC (Akaike Information Criterion) is a criterion for measuring the performance of the Poisson regression method (Johnson, 1998; Myers, 1989).
The researchers also use SVR with an RBF (radial basis function) kernel as a bio-technology forecasting model; the results are in Table 5, where no. of S.V. is the number of support vectors.
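For reference, the AIC of a fitted Poisson model can be computed from its log-likelihood as AIC = 2k - 2 log L, where k is the number of estimated parameters. The sketch below uses hypothetical function names, counts and fitted means; it does not reproduce the values in Table 4.

```python
import math

def poisson_log_likelihood(counts, fitted_means):
    """Sum of log Poisson probabilities of the counts under the fitted means."""
    return sum(
        y * math.log(mu) - mu - math.lgamma(y + 1)  # lgamma(y + 1) = log(y!)
        for y, mu in zip(counts, fitted_means)
    )

def aic(log_likelihood, n_params):
    """Akaike Information Criterion; smaller values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical counts and fitted means, invented for illustration only.
toy_counts = [2, 3]
toy_means = [2.0, 3.0]

log_lik = poisson_log_likelihood(toy_counts, toy_means)
aic_value = aic(log_lik, n_params=2)  # two parameters: intercept and slope
```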
Table 3: Result of linear regression
Table 4: Result of Poisson regression
Table 5: Result of SVR (RBF)
Table 6: Result of MA
Table 7: MSEs of comparative models
In Table 5, C is the regularization parameter (a constant) and Gamma is the parameter of the RBF kernel. Finally, we consider the MA method as a forecasting model. Table 6 gives the MA results for each value of m.
Among the MA results, m = 2 showed the best performance for all IPC codes. The MSE values of all comparative models are shown in Table 7. For IPC code C12M, Poisson regression had the smallest MSE among the compared methods.
On the other hand, MA (m = 2) gave the best MSE for C12N, C12P and C12Q. The patterns of the time series plots in Fig. 1-4 can be classified into two groups: C12M belongs to one group, and C12N, C12P and C12Q to the other. In general, the researchers expect the trend of bio-technology to resemble the group containing C12N, C12P and C12Q, so the MA model is effective for forecasting technology.
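The comparison of window sizes by MSE can be sketched as below. The evaluation protocol is an assumption: each test year is forecast one step ahead from the m most recent observed values, and the actual value is then added to the history. The text does not say whether the experiment used this rolling scheme, and the function name and counts are hypothetical.

```python
def rolling_ma_mse(train, test, m):
    """MSE of one-step-ahead MA(m) forecasts over the test period."""
    history = list(train)
    squared_errors = []
    for actual in test:
        forecast = sum(history[-m:]) / m   # mean of the m most recent values
        squared_errors.append((actual - forecast) ** 2)
        history.append(actual)             # assumed: actual value becomes known
    return sum(squared_errors) / len(squared_errors)

# Hypothetical yearly counts with a steady upward trend, for illustration only.
train = [10, 12, 14, 16]
test = [18, 20]

mse_by_m = {m: rolling_ma_mse(train, test, m) for m in (2, 3, 4)}
best_m = min(mse_by_m, key=mse_by_m.get)
```

On a steadily rising series such as this toy example, the shortest window lags the trend least and therefore has the smallest MSE, which is consistent with m = 2 performing best in Table 6.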
CONCLUSION
This study presented a technology forecasting model based on frequency time series analysis, using MA and the CLT. The researchers applied the model to bio-technology forecasting. The experimental data were constructed from the yearly frequency of bio-technology patents. The researchers found that MA performed better than the comparative methods (linear regression, Poisson regression and SVR) for three of the four IPC codes, namely C12N, C12P and C12Q. To obtain more advanced results, the researchers will study new learning methods such as hybrid neural networks and diverse time series models.
Sunghae Jun and Daiho Uhm. Technology Forecasting Using Frequency Time Series Model: Bio-Technology Patent Analysis.
DOI: https://doi.org/10.36478/jmmstat.2010.101.104
URL: https://www.makhillpublications.co/view-article/1994-5388/jmmstat.2010.101.104