The data center is a hot topic at the moment because it helps enterprises stop reinventing the wheel. Although Internet companies have practiced it for many years, it is still relatively new to traditional companies.
This article is excerpted from the book "Data Center Architecture: Best Practices for Enterprise Dataization". Its author, Zhang Xu, partner and senior vice president of Kangaroo Cloud, and his team distilled their experience from implementing multiple data center projects into a methodology: the five-step method of data center construction.
01
Step 1: Inventory and planning of data resources
The basis of digitization is the data generated by informatization. This data carries meaning in its own right, and once it enters the data framework system it continues to produce more data and greater value through computation. Taking inventory of enterprise data resources is therefore the prerequisite and foundation of data construction, and a complete, accurate inventory is a strong guarantee for the work that follows.
The inventory and planning of data resources need to achieve the following goals:
(1) Inventory and count the existing data resources.
(2) Plan the data resources that the enterprise can or should have.
(3) Build an inventory system and use the necessary tools to ensure that the inventory results always stay consistent with the actual situation.
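Goal (3) can be illustrated with a small consistency check. The following is only a minimal sketch, assuming the inventory is kept as a list of records and that some scanning tool can report which tables actually exist; the `DataResource` fields, the `inventory_drift` function, and the sample systems and tables are all hypothetical and not taken from the book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataResource:
    """One inventory entry (all field names are hypothetical)."""
    system: str  # source system, e.g. "crm"
    table: str   # table or dataset name
    owner: str   # team responsible for the data


def inventory_drift(catalog, actual_tables):
    """Compare the recorded inventory with what actually exists.

    catalog: iterable of DataResource entries
    actual_tables: set of (system, table) pairs found by scanning the sources
    Returns two sorted lists: tables missing from the inventory, and
    inventory entries whose table no longer exists.
    """
    recorded = {(r.system, r.table) for r in catalog}
    missing = sorted(actual_tables - recorded)
    stale = sorted(recorded - actual_tables)
    return missing, stale


# Illustrative data only.
catalog = [
    DataResource("crm", "customers", "sales"),
    DataResource("erp", "orders", "finance"),
    DataResource("erp", "legacy_invoices", "finance"),
]
actual = {("crm", "customers"), ("erp", "orders"), ("erp", "shipments")}

missing, stale = inventory_drift(catalog, actual)
print("Not yet inventoried:", missing)
print("Inventoried but no longer present:", stale)
```

Running a check like this on a schedule is one possible way to keep the inventory aligned with reality; a real inventory system would also track fields, volumes, and ownership changes.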
02
Step 2: Data application planning and design
Enterprises should carry out relatively complete data application planning based on their existing technical conditions and plans. This step should answer the following questions.
What are the data needs in the enterprise?
We need to sort out data requirements from business lines and business levels down to the most fine-grained job positions.
Which data applications should the enterprise build?
We need to plan and design the data applications as a whole around these data requirements.
In what order should these data applications be implemented?
We need to establish an evaluation model for data applications. The evaluation dimensions include three main aspects: whether the data application can be realized, the business value of the data application, and the implementation cost of the data application. Through the evaluation results, we can determine the implementation path of the data application.
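The evaluation model described above can be sketched as a simple weighted score. This is a minimal illustration, assuming each dimension is rated from 1 to 5; the weights, the rating scale, the `DataApplication` class, and the sample applications are hypothetical choices, not the book's model.

```python
from dataclasses import dataclass


@dataclass
class DataApplication:
    name: str
    feasibility: int     # 1 (hard to realize) .. 5 (readily achievable)
    business_value: int  # 1 (low) .. 5 (high)
    cost: int            # 1 (cheap) .. 5 (expensive)


# Hypothetical weights; each enterprise would calibrate its own.
WEIGHTS = {"feasibility": 0.3, "business_value": 0.5, "cost": 0.2}


def priority_score(app):
    """Combine the three dimensions into one score (higher = do sooner)."""
    cost_score = 6 - app.cost  # invert so that cheaper applications score higher
    return (WEIGHTS["feasibility"] * app.feasibility
            + WEIGHTS["business_value"] * app.business_value
            + WEIGHTS["cost"] * cost_score)


def implementation_order(apps):
    """Return the applications sorted from highest to lowest priority."""
    return sorted(apps, key=priority_score, reverse=True)


# Illustrative candidates only.
candidates = [
    DataApplication("sales dashboard", feasibility=5, business_value=3, cost=2),
    DataApplication("churn prediction", feasibility=2, business_value=5, cost=5),
    DataApplication("inventory alerts", feasibility=4, business_value=4, cost=2),
]

for app in implementation_order(candidates):
    print(f"{app.name}: {priority_score(app):.1f}")
```

A weighted sum is only one way to combine the dimensions. An application that cannot be realized at all would more reasonably be filtered out before scoring rather than given a low feasibility rating.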
03
Step 3: Data asset construction
The construction of data assets must rely on the core products of the data center. Data assets are the key foundation of enterprise data construction: all data-related work is ultimately built on data assets and carried out around them. Data assets are also the foundational layer in which enterprises invest the most during the early stages of comprehensive data construction, and the one that is slowest to show results. A large part of the discussions, disputes, and compromises surrounding the data center stem from the size, complexity, and high cost of this infrastructure.
The content of data asset construction includes the following aspects:
Technology construction
(1) Product selection. Product selection includes how to choose data center products, the functions that data center products should have, and technical parameter indicators.
(2) Technical architecture design. Technical architecture design includes how to deploy the data center products, how to replace the traditional data warehouse or run in parallel with it, and how to extract current application data into the data center.
Standards and data warehouse model building
(1) Modeling and development specifications. These include formulating data warehouse model design specifications, formulating data development specifications, and avoiding the currently common problems of disorderly data development and difficult operation and maintenance.
(2) Data modeling. Data modeling includes building a data warehouse model and submitting it for review.
Data extraction, data development, task monitoring and operation and maintenance
(1) Data extraction. Data extraction includes extracting data from the data resource layer into the ODS layer.
(2) Data development. Data development includes data task development, data cleaning, and data calculation.
(3) Task monitoring and operation and maintenance. This includes monitoring all data tasks and, where necessary, manually intervening in and handling abnormal and failed tasks.
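The monitoring and manual-intervention idea in item (3) can be sketched as follows. This is a minimal sketch under stated assumptions: tasks are plain Python callables, failures are retried a fixed number of times, and anything that still fails is queued for a person to handle. The `run_with_monitoring` function and the sample tasks are hypothetical; a real data center product would provide its own scheduler and alerting.

```python
def run_with_monitoring(tasks, max_retries=2):
    """Run each (name, callable) task, retrying on failure.

    Returns (succeeded, needs_manual): the names of tasks that completed,
    and (name, error message) pairs for tasks that failed every attempt
    and therefore need manual intervention.
    """
    succeeded, needs_manual = [], []
    for name, func in tasks:
        last_error = None
        for _attempt in range(1 + max_retries):
            try:
                func()
                succeeded.append(name)
                break
            except Exception as exc:  # record the failure and retry
                last_error = exc
        else:
            # The loop finished without a successful run.
            needs_manual.append((name, str(last_error)))
    return succeeded, needs_manual


# Illustrative tasks only.
calls = {"n": 0}


def flaky_extract():
    """Fails on the first call, succeeds afterwards."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise ConnectionError("source database unavailable")


def broken_cleaning():
    """Always fails, so it should end up in the manual queue."""
    raise ValueError("unexpected null in customer_id")


tasks = [
    ("extract_orders_to_ods", flaky_extract),
    ("clean_customers", broken_cleaning),
]
succeeded, needs_manual = run_with_monitoring(tasks)
print("Succeeded:", succeeded)
print("Needs manual handling:", needs_manual)
```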
Data quality verification
Data quality verification includes verifying and resolving the data quality issues discovered so far, which in turn drives the development and continuous optimization of data governance.
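As a minimal sketch of what such verification might look like, the function below checks rows for missing required fields and duplicate keys. The `check_quality` function, the rule set, and the sample rows are hypothetical; real data quality rules depend on the enterprise's own data.

```python
def check_quality(rows, required_fields, key_field):
    """Return a list of (row index, issue description) pairs.

    Checks two hypothetical rules: every required field must be present
    and non-empty, and key_field values must be unique across rows.
    """
    issues = []
    seen_keys = set()
    for index, row in enumerate(rows):
        for field in required_fields:
            if row.get(field) in (None, ""):
                issues.append((index, f"missing {field}"))
        key = row.get(key_field)
        if key is not None:
            if key in seen_keys:
                issues.append((index, f"duplicate {key_field}={key}"))
            seen_keys.add(key)
    return issues


# Illustrative rows only.
rows = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": None},
    {"order_id": 1, "amount": 5.0},
]

for index, issue in check_quality(rows, ["order_id", "amount"], "order_id"):
    print(f"row {index}: {issue}")
```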
Data application support
Data application support includes providing a supporting development platform for current data application development.
04
Step 4: Detailed design and implementation of data applications
Whether the waterfall model or the agile model is used, the design of data applications can generally follow the process and concepts of traditional information application design. Data development in data applications is generally completed in a database or data warehouse. Data application content can be presented with BI analysis tools, large-screen visualizations, or custom-developed applications. Data applications can also expose data results through API services for other external applications to call on demand. Developing data applications differs from developing traditional information applications in the following ways.
Data applications focus on the content and quality of data sources
Before implementing a data application, we should fully understand the current state of the enterprise's data sources, including the types of data, the specific attributes of each type, and the quality of the data content. Most failed data applications fail because of problems in the data source, such as missing data or poor data quality.
Complex data development requires continuous tuning and iteration
With the introduction of algorithms such as machine learning and deep learning, the methods for building data models are becoming increasingly varied. In general, however, generating final business value is a complex process that requires not only data support but also cooperation from management.
Verifying the results of data applications takes a large share of the workload
Demonstrating the correctness of data results or evaluating the effect of a data application is time-consuming and laborious. Even for relatively simple index calculations, verifying correctness often takes more than 1/3 of the entire process. For many algorithm projects, it is even necessary to build an outcome evaluation model in advance and obtain the client's approval before data development can begin.
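One common way to verify an index calculation is to recompute it independently from the source records and compare the two results. The sketch below assumes a simple "total amount per region" index; the `reconcile` function, the tolerance, and the sample figures are hypothetical and only illustrate the idea.

```python
import math


def reconcile(reported, source_rows, tolerance=0.01):
    """Compare reported totals with totals recomputed from source rows.

    reported: dict mapping region -> total produced by the data application
    source_rows: iterable of (region, amount) pairs from the source system
    Returns a dict of region -> (reported value, recomputed value) for
    every region where the two disagree or one side is missing.
    """
    recomputed = {}
    for region, amount in source_rows:
        recomputed[region] = recomputed.get(region, 0.0) + amount

    mismatches = {}
    for region in sorted(set(reported) | set(recomputed)):
        actual = reported.get(region)
        expected = recomputed.get(region)
        if (actual is None or expected is None
                or not math.isclose(actual, expected, abs_tol=tolerance)):
            mismatches[region] = (actual, expected)
    return mismatches


# Illustrative figures only.
source_rows = [("north", 100.0), ("north", 50.0), ("south", 80.0)]
reported = {"north": 150.0, "south": 75.0}

print(reconcile(reported, source_rows))
```

Checks like this do not remove the verification workload, but they make it repeatable each time the calculation changes.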
The operation and maintenance of data applications is difficult
Because abnormal situations in data are often unknowable or unexpected, data operation and maintenance requires substantial manual support to keep tasks running.
The results of data applications require operations
Completing the development of a data application is only the first step toward realizing the value of data. Helping business departments understand the model and make good use of the data is the key to what follows. Especially when new data has just been introduced and its business value has not yet been shown, companies need to conduct in-depth data operations.
05
Step 5: Data organization planning
In the future, enterprise digitization should be treated as a matter of strategic importance, and it requires an organization of equal strategic standing to drive it. That organization may be a transformed traditional IT department, or it may come from the involvement of a strategy department or a similar unit; either is a good choice. The organization is central to the smooth implementation of the data center, and it is also the driver of the enterprise's data transformation.