With the deep integration of the Internet and business, commerce has entered an era of intelligent, data-driven decision-making. In this disruptive new era, most decisions that people used to make by judgment will become precise, data-driven decisions. For consumer goods companies, analyzing membership and transaction data makes it possible to build accurate customer portraits and predict customer behavior, which matters greatly for brand strategy and business performance. Data analysis has therefore become the most important strategic capability of brand enterprises. Yet many traditional businesses lack data talent and analytics experience, move slowly in digital transformation, and doubt the business value of data analytics. Our analysis of membership data from a fashion brand company demonstrates that value through concrete business insights and a structured decision-making process.
Most consumer companies now maintain membership databases that record customer information such as address, age and contact details. This data is mainly used to invite customers back for repeat purchases and to maintain the relationship. Very few companies analyze it effectively to guide their business.
In the big data analysis project described in this paper, internal and external business experts worked together with data experts, following a predefined implementation process for big data analysis projects (see the follow-up article). The project team arrived at a number of important business insights and developed valuable recommendations. The analysis helps the enterprise understand its customers better and offer more targeted products and services that give consumers more value. It also helps the company find more effective approaches to member development and regional expansion. Such “small analysis, large profit” projects can strengthen the confidence of traditional enterprises in digital transformation, and so contribute to the digital transformation of Chinese enterprises and to new momentum for China’s economic development.
The data for this article comes from F Company (F stands in for the company’s name). F was founded in Hong Kong in 1979. Its business is now mainly in mainland China, with a small number of stores in Hong Kong and Singapore. F is engaged in the research and development, design, production and sales of high-grade fashion leather bags, and its core business is operating a fashion women’s bag brand. Over time it has drawn on the best of the bag industry abroad and in Hong Kong, Macao and Taiwan, adapted it to conditions in China, and gradually formed its own distinctive style.
This article uses only membership data, which does not include records of individual transactions. From the analysis of this data, covering the speed and quality of membership development, card-opening channels and their age distribution, member age trends, and the geographical distribution of members, we derive 8 business insights. Based on these insights, we make 12 suggestions on the company’s future membership development strategy, product development strategy, and geographic strategy to help the company achieve high-quality growth.
The article first describes the data and the selection of features and dimensions, and then proceeds through data cleaning, data visualization, data classification, correlation analysis and data prediction. In the algorithms and analysis section, we present the results of the analysis and give business insights and recommendations based on those results and the systematic analysis process. An outlook for future research is given at the end.
2. Description of Data
The member management database of F is very traditional and contains 23 features (see Table 1). It holds a total of 1,011,538 records dating from 2015 to 2019.
Preliminary analysis shows that the member data is structured but incomplete, with a large number of missing and erroneous values. Take “Member Birthday” and “Member Age” as an example: only 443,081 records have valid birthday and age data, or 43.8% of the total. The database therefore contains a great deal of invalid data, and deleting every record with any abnormal value would discard more than half of the records and harm the accuracy of the whole analysis. We therefore use a dynamic data cleaning method: when analyzing each dimension, we delete only the records whose values are abnormal in the features used by that dimension.
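The dynamic cleaning idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the field names `member_age` and `card_open_year`, the sample records, and the helper `clean_for` are hypothetical and do not come from the real database; the validity ranges (age 18-80, years 2015-2019) are the ones used later in this paper.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Rule = Callable[[Any], bool]

# Hypothetical records; the field names are illustrative, not the real schema.
members: List[Record] = [
    {"member_age": 34, "card_open_year": 2017},
    {"member_age": None, "card_open_year": 2018},  # missing age
    {"member_age": 230, "card_open_year": 2016},   # impossible age
    {"member_age": 41, "card_open_year": None},    # missing year
]


def valid_age(value: Any) -> bool:
    """Age is valid when it is an integer in the 18-80 range."""
    return isinstance(value, int) and 18 <= value <= 80


def valid_year(value: Any) -> bool:
    """Card-opening year is valid when it falls in 2015-2019."""
    return isinstance(value, int) and 2015 <= value <= 2019


def clean_for(records: List[Record], rules: Dict[str, Rule]) -> List[Record]:
    """Keep records that pass the rule for every field this analysis uses.

    Fields that the current analysis does not use are ignored, so a record
    with a bad age can still contribute to a year-only analysis.
    """
    return [
        r for r in records
        if all(rule(r.get(field)) for field, rule in rules.items())
    ]


age_view = clean_for(members, {"member_age": valid_age})
year_view = clean_for(members, {"card_open_year": valid_year})
both_view = clean_for(
    members, {"member_age": valid_age, "card_open_year": valid_year}
)
print(len(age_view), len(year_view), len(both_view))
```

Cleaning per dimension keeps two or three of the four sample records, whereas requiring every field to be valid at once keeps only one, which is the loss the dynamic method is meant to avoid.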
3. Selection of Features and Dimensions
In the early stage of the analysis, we visualized almost all the features. Based on this preliminary view of the data, we selected the following 8 features and 7 dimensions for analysis (see Table 2 and Table 3 respectively).
Table 1. Data features of the member management database.
Table 2. List of the 8 features selected.
Table 3. Seven analysis dimensions selected.
4. Algorithms and Analysis
The entire analysis is carried out in Python and is based on membership data provided by the F brand. To make the analysis more effective, we also imported some external data from public online sources, such as urban population and disposable income figures. The analysis of each dimension has three parts: data analysis, business insights, and business recommendations.
4.1. Card Opening Trend Analysis
The number of membership cards opened each year, together with the proportion of those new members who go on to make a purchase, measures the speed and quality of member development. We therefore analyze from two angles. The first is the number of cards opened each year, which we fit in order to forecast the number for 2019. The second is the proportion of each year’s new members who have made at least one purchase so far, which measures the quality of member development.
1) Number of cards opened per year
We limit the “card opening date” to the five years from 2015 to 2019 and keep only valid data. We then extract “card opening year” from “card opening date”, clean the data, and visualize it, as shown in Figure 1.
2) Analysis of the growth of membership each year
We keep only valid data and use the number of cards opened in 2015-2018 to forecast the number of cards opened in 2019. See Figure 2.
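As a rough illustration of this kind of forecast, the sketch below fits an exponential growth curve to four yearly counts by ordinary least squares on the logarithm of the counts, then extrapolates one year ahead. The paper does not state which curve was fitted, so the exponential model is an assumption, and the yearly counts are hypothetical placeholders rather than F brand's figures.

```python
import math
from typing import List, Tuple


def fit_exponential(years: List[int], counts: List[float]) -> Tuple[float, float]:
    """Least-squares fit of ln(count) = a + b * (year - years[0])."""
    xs = [y - years[0] for y in years]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept: float, slope: float, base_year: int, year: int) -> float:
    """Evaluate the fitted curve for a given calendar year."""
    return math.exp(intercept + slope * (year - base_year))


# Hypothetical counts that double every year (not the real data).
years = [2015, 2016, 2017, 2018]
counts = [20_000, 40_000, 80_000, 160_000]

a, b = fit_exponential(years, counts)
forecast_2019 = predict(a, b, years[0], 2019)
print(round(forecast_2019))
```

A forecast of this kind only extends the past trend; comparing it with the partial-year actual count is what reveals a change in momentum.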
3) Zero consumption per year as a bar chart
We filter the data, limit the year to 2015-2019, and use the features “card opening year” and “total historical consumption”. From “total historical consumption” we derive a new feature, “whether to consume”, and visualize the results as a stacked bar chart. See Figure 3.
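The derived feature and the yearly share behind the stacked chart can be sketched as follows. The function name `zero_consumption_share` and the sample rows are hypothetical; the only logic taken from the text is that a member counts as a consumer when total historical consumption is greater than zero.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def zero_consumption_share(rows: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Return, per card-opening year, the share of members who never purchased.

    Each row is (card_opening_year, total_historical_consumption).
    """
    totals: Dict[int, int] = defaultdict(int)
    zeros: Dict[int, int] = defaultdict(int)
    for year, total_spend in rows:
        consumed = total_spend > 0  # the derived "whether to consume" feature
        totals[year] += 1
        if not consumed:
            zeros[year] += 1
    return {year: zeros[year] / totals[year] for year in sorted(totals)}


# Hypothetical rows for illustration only.
rows = [
    (2016, 0.0), (2016, 350.0),
    (2017, 0.0), (2017, 0.0), (2017, 120.0),
    (2018, 0.0),
]
shares = zero_consumption_share(rows)
for year, share in shares.items():
    print(year, f"{share:.0%}")
```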
Figure 1. The number of new cards opened each year.
Figure 2. Fitted curve for the number of new members.
Figure 3. Percentage of zero consumption members among new card-opening members.
Business Insight I: The number of cards opened is facing a growth bottleneck and a breakthrough is urgently needed
China’s consumer Internet has developed rapidly for 20 years, and consumer behavior has changed greatly. In the pre-Internet era, people went shopping in malls or supermarkets, a typical offline, experience-based form of consumption with a certain degree of randomness. In the Internet era, consumer behavior is driven by search and social media, and a brand without fans and members will not survive. This shift has made members the most important strategic asset of a brand and central to its success.
The membership data shows that F brand started from a small base: in 2015 there were only 19,534 members. As Figure 1 shows, the number of members then grew rapidly. If that momentum were maintained, nearly 520,000 new members would be added in 2019, according to the growth curve in Figure 2. However, only 111,376 members were added in the first five months of 2019. If that monthly pace held for the rest of the year, 2019 would end with roughly 267,000 new members (111,376 × 12 / 5), about half of the fitted forecast. The 2019 figures therefore break the trend of rapid growth.
This is a clear warning signal for the brand: the way members were developed over the past five years has become unsustainable.
Business Insight II: Zero consumption membership ratio remains high
Figure 3 shows that the proportion of zero-consumption members remained high from 2016 to 2018, exceeding 50% of each year’s new members. This indicates that card opening in recent years has not been effective. After investigation, we attribute the problem to two causes. First, there are problems with the way the brand develops members. In offline channels, many shop assistants signed up relatives and friends to meet their membership targets; most of these people are not target customers and are naturally unlikely to make a purchase. Second, there is a problem with how members are activated and incentivized. For those who have registered, it is very important to understand their needs accurately and to use effective activities and incentives to draw them into stores or to online channels. The persistently high proportion of zero-consumption members suggests that such activities have been very inefficient.
Suggestion I: Explore new ways to develop membership
The existing ways of developing members, whether offline through shop assistants or online through e-commerce platforms, have hit a growth bottleneck. We should actively explore new brand marketing and promotion methods to make the brand more attractive. For example, the brand could hold large-scale promotional events in core commercial centers to attract offline shoppers to register, or use Xiaohongshu (“Little Red Book”), TikTok and other new social platforms to attract consumer attention. In addition, online and offline channels can work together to create high-impact hit products that draw in new members.
Suggestion II: Develop members in cooperation with professional brand organizations
In the Internet age, the way enterprises develop has changed greatly. The traditional model is organic, internally driven growth that gradually builds a unique competitive advantage. In the Internet age, enterprises should instead quickly pool resources of all kinds into a mutually beneficial business ecosystem and build an ecosystem advantage. China has formed the world’s most mature Internet business ecosystem, with a large number of agencies that specialize in brand communication over the Internet. Some of them work with nearly 10 million small opinion leaders who reach more than a billion fans. Such agencies have both great promotional reach and the ability to target promotion accurately. If F brand can work with them, it could break through the bottleneck in membership growth. F brand should aim for more than 10 million active members within the next 3 to 5 years, which would be the foundation of the brand’s success.
Suggestion III: Change the incentive mechanism for developing members
Our investigation found that the brand’s current incentive for developing members is based on the number of members each store employee signs up per day. Such incentives produce many “pseudo-members” who will never make a purchase. We recommend that staff be assessed not only by the number of members they sign up, but also by the number of those members who convert into paying customers.
4.2. Analysis of Member’s Consumption
Although the database has no record of individual purchases, the system does record the number of purchases each member has made. Analyzing purchase counts lets us evaluate members’ loyalty to the brand. The time since each member’s last purchase is also important for understanding members’ recent activity. Through this analysis, we can find out how many members are dormant and at risk of being lost.
1) Pie chart of the number of purchases
We filter the data and use “Number of purchases during statistics” as the feature for this analysis. No new feature is needed; we simply count the members in each segment. The result is shown in Figure 4.
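The segment counting can be sketched as below. The four segments (0, 1, 2-10, more than 10 purchases) follow the discussion of Figure 4 in this section; the function names and the sample purchase counts are hypothetical.

```python
from collections import Counter
from typing import Dict, List

SEGMENTS = ("0", "1", "2-10", ">10")


def purchase_segment(n: int) -> str:
    """Map a purchase count to one of the four loyalty segments."""
    if n < 0:
        raise ValueError("purchase count cannot be negative")
    if n == 0:
        return "0"
    if n == 1:
        return "1"
    if n <= 10:
        return "2-10"
    return ">10"


def segment_shares(purchase_counts: List[int]) -> Dict[str, float]:
    """Share of members in each segment; all zeros for an empty input."""
    if not purchase_counts:
        return {segment: 0.0 for segment in SEGMENTS}
    counts = Counter(purchase_segment(n) for n in purchase_counts)
    total = len(purchase_counts)
    return {segment: counts[segment] / total for segment in SEGMENTS}


# Hypothetical purchase counts for eight members.
sample_shares = segment_shares([0, 0, 0, 1, 1, 3, 12, 0])
print(sample_shares)
```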
2) Last purchasing date analysis
We derive a new feature, “time from last consumption”, from the feature “last consumption date”, divide the data into four intervals, and count the members in each interval. See Figure 5.
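A minimal sketch of this recency feature is shown below. The paper does not give the cut points of the four intervals or the reference date, so the yearly edges in `EDGES_DAYS`, the labels, and the as-of date `AS_OF` are all assumptions made for illustration, as are the sample dates.

```python
from bisect import bisect_left
from collections import Counter
from datetime import date

# Assumed interval edges in days (upper bounds, inclusive) and their labels.
EDGES_DAYS = (365, 730, 1095)
LABELS = ("within 1 year", "1-2 years", "2-3 years", "over 3 years")

# Assumed reference date for measuring "time from last consumption".
AS_OF = date(2019, 5, 31)


def recency_bucket(last_purchase: date, as_of: date = AS_OF) -> str:
    """Classify a member by days elapsed since the last purchase."""
    days = (as_of - last_purchase).days
    if days < 0:
        raise ValueError("last purchase is after the reference date")
    return LABELS[bisect_left(EDGES_DAYS, days)]


# Hypothetical last-purchase dates.
last_dates = [
    date(2019, 1, 15),
    date(2018, 1, 1),
    date(2016, 6, 1),
    date(2015, 3, 1),
]
bucket_counts = Counter(recency_bucket(d) for d in last_dates)
print(bucket_counts)
```

Keeping the edges in one tuple makes it easy to switch to whatever interval boundaries the business actually uses to define active, at-risk and dormant members.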
Figure 4. Analysis of the number of members’ consumption.
Figure 5. Last purchasing date chart.
Business Insight III: Brand loyalty is poor and the share of loyal members is very low
More than 60% of members have never made a purchase. As noted in the analysis of zero-consumption members, they are not real users of the brand and not valid members. Members who purchased once account for 26%; they have not yet formed brand loyalty. Members who purchased 2 to 10 times account for about 12% and are the brand’s loyal consumers. Members who purchased more than 10 times are the brand’s hardcore fans, who put almost all of their spending into the brand; they represent only 0.211%.
Business Insight IV: Too many dormant members and too few active members
Figure 5 shows that members whose last purchase was within the past year account for 23%; these are active members. Members whose last purchase was between one and two years ago account for 15% and are at risk of being lost. Members whose last purchase was more than two years ago account for 62%; they are essentially dormant and even more likely to be lost. The data makes clear that member activity is currently very low and that the brand risks losing a large number of members.
Suggestion IV: Conduct a systematic survey of members who only spend once and listen to their feedback
Members who buy once and do not buy again after using the product often have first-hand knowledge of the product and the brand. Knowing why they choose not to repurchase can help us improve our products.
Suggestion V: Establish a routine, personalized member management system
Brand companies typically analyze member activity at intervals and try to reactivate customers on the edge of dormancy through temporary campaigns. The effect of such campaigns is often limited. We recommend adopting more advanced CRM software to build an accurate portrait of each consumer, manage each consumer individually, and develop personalized service programs that push products or services according to each customer’s needs. For example, the brand could track the purchase date of each product and regularly invite members back to the store for product maintenance. This borrows from the service model of 4S car dealerships, which track each car and regularly invite owners back for servicing. Moving from periodic maintenance to personalized, precise maintenance is the way forward for member management.
4.3. Trend Analysis of Members Age Change
By analyzing the average age of each year’s card-opening members, we can get the trend of the age of the members. We chose two analytical perspectives. The first is the average age of the annual card-opening members, and the second is the distribution of members of different age groups who open cards each year.
1) The average age of the card-opening members each year
This analysis limits member age to 18-80 and the card opening date to 2015-2019. It uses “member age” and “card opening date” as features, and derives the new feature “card opening year” from “card opening date”. The result is shown in Figure 6.
2) Age distribution for the annual card opening member
We limit “member age” to 18-80 and the card opening date to 2015-2019. The features used are “member age” and “card opening date”. The new feature “card opening year” is extracted from “card opening date”, and the new feature “age at the time of card opening” is calculated from “member age” and “card opening year”. The data is divided into five age groups. See Figure 7.
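One way to compute “age at the time of card opening” is sketched below. It assumes that “member age” is the member's age in a fixed reference year (taken here as 2019, the last year in the data), which the paper does not state explicitly. The five age-group boundaries are also hypothetical, since the paper does not list them.

```python
from bisect import bisect_right

# Assumption: "member age" is the age as of this reference year.
REFERENCE_YEAR = 2019

# Hypothetical boundaries for the five age groups (lower bounds of groups 2-5).
AGE_EDGES = (25, 35, 45, 55)
AGE_LABELS = ("18-24", "25-34", "35-44", "45-54", "55-80")


def age_at_opening(member_age: int, card_open_year: int,
                   reference_year: int = REFERENCE_YEAR) -> int:
    """Subtract the years elapsed since card opening from the current age."""
    return member_age - (reference_year - card_open_year)


def age_group(age: int) -> str:
    """Assign an age to one of the five groups."""
    return AGE_LABELS[bisect_right(AGE_EDGES, age)]


# A member aged 40 in the reference year who opened a card in 2015.
opening_age = age_at_opening(40, 2015)
print(opening_age, age_group(opening_age))
```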
Figure 6. Average age trend chart for open-card members.
Figure 7. Age distribution of card-opening members in different years.
Business Insight V: The brand shows a clear trend toward younger members
The average age of new card-opening members was around 37 in each of the three years 2015-2017, dropped sharply to 33 in 2018, and returned to 35 in 2019.
In the annual age distribution of card openers, the proportion of young members has increased overall, while the proportion of middle-aged and elderly members has declined accordingly. This also suggests that the brand’s strategy of promoting a younger image has achieved some results.
Suggestion VI: Rebranding
As the brand’s membership becomes younger, its products, channels and communication methods should change fundamentally to meet the needs of young consumers. A few years ago F brand seized the rise of e-commerce, which led to a wave of growth. The next few years should belong to social e-commerce, which presents F brand with new development opportunities. We suggest that F brand set up a dedicated business team for young consumers and create a new business model covering products, value positioning, channels and promotion, rebuilding the brand around young consumers. This is the “second curve” described by Charles Handy.
Suggestion VII: Rethinking the needs of older members
At present, young members have strong spending power. Many brands have recognized this trend and repositioned themselves for young consumers, and the strategy itself is justified. However, this does not mean that middle-aged or older members have shrinking demand or are not worth serving. On the contrary, they have often entered a more rational stage of consumption and have strong purchasing power, so retaining them is also very important. If the brand focuses only on becoming younger and loses these older members as a result, the enterprise will suffer significant losses. Because the needs of different age groups differ markedly, we recommend that F brand gradually adopt a multi-brand strategy in which different brands serve different consumer groups, so that each brand’s value proposition is more distinct and attractive to its own consumers.
4.4. Analysis of Age Distribution of Members by Card-Opening Channel
F brand was one of the earlier enterprises to use Tmall, JD.com and other e-commerce platforms. We analyze the average age of members from different card-opening channels to see whether there are significant age differences across channels, which helps us offer different products or marketing strategies for different channels. We first compare the average age of online and offline members, and then compare the different online platforms.
1) Average age of online and offline members
We limit member age to 18-80. The features used are “open card counter” and “member age”. From “open card counter” we derive the new feature “opening card channels”, which takes two values: online and offline. See Figure 8.
2) Average age of members of four online channels
We limit member age to 18-80 and restrict “open card counter” to four e-commerce platforms: Tmall flagship store, JD.com flagship store, Micro Mall, and Fashion List. The features used are “open card counter” and “member age”. The result is shown in Figure 9.
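The two comparisons above can be sketched with one grouping helper. The four online counter names are the ones listed in the text; treating every other counter as offline is an assumption, and the offline store names, the sample ages and the helper names are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, Iterable, List, Tuple

# Online counters named in the text; anything else is assumed to be offline.
ONLINE_COUNTERS = {
    "Tmall flagship store",
    "JD.com flagship store",
    "Micro Mall",
    "Fashion List",
}


def channel_of(counter: str) -> str:
    """Derive the online/offline channel from the card-opening counter."""
    return "online" if counter in ONLINE_COUNTERS else "offline"


def mean_age_by(rows: Iterable[Tuple[str, int]],
                key: Callable[[str], str]) -> Dict[str, float]:
    """Average age per group, using only ages in the 18-80 range."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for counter, age in rows:
        if 18 <= age <= 80:
            groups[key(counter)].append(age)
    return {group: mean(ages) for group, ages in groups.items()}


# Hypothetical rows: (open card counter, member age).
rows = [
    ("Tmall flagship store", 30),
    ("JD.com flagship store", 34),
    ("Store A (hypothetical)", 38),
    ("Store B (hypothetical)", 40),
    ("Store A (hypothetical)", 130),  # invalid age, dropped by the filter
]

by_channel = mean_age_by(rows, channel_of)
online_rows = [r for r in rows if r[0] in ONLINE_COUNTERS]
by_platform = mean_age_by(online_rows, lambda counter: counter)
print(by_channel)
print(by_platform)
```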
Business Insight VI: The age gap between different channels is obvious
Figure 8 shows that members who registered online are on average 4 years younger than those who registered offline. Average age also differs slightly among online channels; for example, Tmall members are 2 years younger than JD.com members. Product development and marketing planning should therefore be differentiated by channel. Although brands are now creating seamless online and offline experiences, that does not mean there is no difference between online and offline customers.
Figure 8. Average age of members in online and offline channels.
Figure 9. Average age of members of different e-commerce channels.
Suggestion VIII: Conduct business based on accurate portraits of members from different channels
This analysis compares members of different channels only along the age dimension, which is not enough to understand them. Both Tmall and JD.com can now provide multi-dimensional, accurate member portraits based on platform-wide data, which is very important for fully understanding the differences between members in different channels and provides the basis for data-driven, precise commerce. We strongly recommend that F brand cluster its members across all platforms, both offline and online, and build accurate portraits to provide reliable input for product development and marketing. This is one of the core elements of the “New Retail” concept advocated by Jack Ma, which is currently popular in China.
4.5. Analysis of the Relationship between Age and Consumption Preferences
The database has a “product label” feature with four values: Getting Started, Fashion, Classic, and Boutique, listed in order of increasing price. We analyzed consumption tendencies at different ages, hoping to discover the relationship between age and product label.
1) Consumption tendencies of different ages
We limit member age to 18-80 and filter out “unknown” or null values in “product tags”. The features used are “member age” and “product tags”; no new features are created. The result is shown in Figure 10.
The bar chart for different age groups reveals a major anomaly, with age 30 as a dividing line: among members under 30 there are no Classic or Boutique consumers, and among members over 30 there are no Getting Started or Fashion consumers. After checking the data, we learned that product tags are not recorded from actual purchase behavior. Instead, members under 30 are simply labeled Getting Started or Fashion, and members over 30 are labeled Classic or Boutique. The tags therefore say little about the relationship between age and consumption preferences.
Figure 10. Product preferences for different age groups.
Suggestion IX: Integrate member data and transaction data to enhance the membership system
The separation of membership systems from transaction systems is very common among Chinese companies. We recommend linking member data with transaction data so that a customer’s product preferences are determined from real purchase behavior. For example, a member would automatically be tagged “entry” if most of the products that member purchased are entry-level. In this way, each member’s product tag can be determined accurately.
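The tagging rule suggested above can be sketched as a simple majority vote over a member's purchases. The function name `dominant_tag`, the interpretation of “most” as a share strictly above one half, and the sample purchase lists are assumptions for illustration; a linked transaction system would supply the real purchase records.

```python
from collections import Counter
from typing import List, Optional


def dominant_tag(purchased_tags: List[str],
                 min_share: float = 0.5) -> Optional[str]:
    """Return the product tag that makes up most of a member's purchases.

    Returns None when the member has no purchases or when no single tag
    exceeds min_share of them, so the member is left untagged.
    """
    if not purchased_tags:
        return None
    tag, count = Counter(purchased_tags).most_common(1)[0]
    if count / len(purchased_tags) > min_share:
        return tag
    return None


# Hypothetical purchase histories, one product tag per purchased item.
print(dominant_tag(["Entry", "Entry", "Fashion"]))
print(dominant_tag(["Classic", "Boutique"]))
print(dominant_tag([]))
```

Leaving a member untagged when no tag dominates avoids the problem found in Figure 10, where a tag was assigned by rule rather than observed from behavior.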
4.6. Analysis of the Relationship between Member’s Consumption Ability and Residents’ Disposable Income
In this part of the analysis, we hope to find out whether there is a correlation between the spending power (total consumption) of the members and the level of regional economic development.
We first analyzed the relationship between member consumption level and the disposable income of residents in each region, and then listed the ten cities with the highest member consumption to check the correlation between spending power and the level of regional economic development.
1) Analysis of the relationship between total consumption and disposable income of residents in the region
We restrict the region to China’s provinces and municipalities, remove unknown values, and use 2018 per capita disposable income data to aid the analysis. We first calculate the average historical consumption amount for each province. The basic features are “member address” and “total historical consumption”, and the new feature “province” is extracted from “member address”. See Figure 11.
2) The total consumption of members in different cities
We again restrict the region to provinces and municipalities and remove unknown values. The basic features are “member address” and “total historical consumption”, and the new feature “city” is extracted from “member address”. The result is shown in Figure 12.
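The relationship examined in Figure 11 can also be summarized numerically. The sketch below averages member consumption per province and then computes the Pearson correlation coefficient against per capita disposable income. The paper reports only the scatter plot, so using Pearson's r is an assumption, and the province names and all figures are placeholders rather than real statistics.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def average_spend_by_province(
    rows: Iterable[Tuple[str, float]]
) -> Dict[str, float]:
    """Average total historical consumption per province."""
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for province, spend in rows:
        sums[province] += spend
        counts[province] += 1
    return {p: sums[p] / counts[p] for p in sums}


def pearson_r(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Placeholder member rows (province, total historical consumption).
member_rows = [
    ("Province A", 900.0), ("Province A", 1100.0),
    ("Province B", 800.0),
    ("Province C", 1000.0),
]
# Placeholder per capita disposable income, not official statistics.
income = {"Province A": 22_000.0, "Province B": 64_000.0, "Province C": 26_000.0}

avg_spend = average_spend_by_province(member_rows)
provinces = sorted(avg_spend)
r = pearson_r([income[p] for p in provinces], [avg_spend[p] for p in provinces])
print(avg_spend, round(r, 3))
```

A coefficient close to zero on the real data would correspond to the absence of any clear pattern in the scatter plot.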
Figure 11. Scatter plot of disposable income and per capita consumption.
Figure 12. The average consumption per member of the top 10 cities.
Business Insight VII: There is no obvious correlation between F brand members’ spending power and regional development level
As Figure 11 shows, there is no obvious correlation between residents’ disposable income and members’ consumption. Similarly, in the bar chart of member consumption by city (Figure 12), the top 10 cities are mostly second- and third-tier cities, with first-tier cities such as Beijing and Shanghai ranking behind them. This also shows that residents’ disposable income is not a decisive factor in how much members spend.
We believe that although F is a well-known brand in China, it is still a second-tier brand with relatively low prices, so residents’ income level is not the key factor in the purchase decision. This is very different from first-tier luxury brands such as LV. In addition, the total amount spent by members in an area has a lot to do with how long the brand has been present there and with its local influence. This conclusion has not been rigorously tested with data, but our interviews with F brand support it.
Suggestion X: Geographical strategy should focus on the relative market share of the region
For F brand, the level of regional economic development is not the most important factor in deciding whether to enter a region. We should pay more attention to whether the brand is able to become the market share leader there. In addition, for areas already covered, F brand can use market share analysis to focus on core areas and enlarge its share, and thereby raise the total consumption of members.
4.7. Analysis of Member Geographical Distribution
This section analyzes the ratio of members in each province or municipality of China to that region’s resident population. We define this ratio as the brand’s penetration rate in the region. Penetration shows us where the brand is strong.
We plot these data on a map of China’s administrative divisions to see how brand penetration is distributed across regions, identify opportunities, and guide the brand’s geographical expansion strategy.
1) Distribution of member penetration rate in each province
We restrict the region to provinces and municipalities and remove unknown values. The feature used is “member address”, from which the new feature “province” is extracted. We import the external data “2018 provinces resident population” to calculate the member penetration rate. The result is shown in Figure 13.
In Figure 13, the closer the color is to red, the higher the member penetration rate; the closer the color is to blue, the lower the penetration rate.
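The penetration rate defined in this section is a simple ratio, sketched below as members per 10,000 residents. The function name, the province names and the member and population figures are placeholders chosen for illustration, not the real data behind Figure 13.

```python
from typing import Dict


def penetration_per_10k(member_counts: Dict[str, int],
                        populations: Dict[str, int]) -> Dict[str, float]:
    """Members per 10,000 residents for each province with known population."""
    return {
        province: members * 10_000 / populations[province]
        for province, members in member_counts.items()
        if populations.get(province)
    }


# Placeholder figures for illustration only.
member_counts = {"Province A": 4_800, "Province B": 1_500, "Province C": 700}
populations = {"Province A": 6_000_000, "Province B": 5_000_000}

rates = penetration_per_10k(member_counts, populations)
for province, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(province, f"{rate:.1f} per 10,000")
```

Provinces without a population figure are skipped rather than guessed, in keeping with the per-dimension cleaning used throughout the analysis.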
Business Insight VIII: The dominant market has a trend of linkage, but there are still weak links
Figure 13 shows that the Yangtze River basin is the core area of the brand. At present the brand performs well in Sichuan (SC), Chongqing (ZQ), Hubei (HB), Jiangsu (JS), Shanghai (SH) and neighboring Zhejiang (ZJ). In other parts of the Yangtze River basin, such as Hunan (HN), Jiangxi (JX) and Anhui (AH), performance is only average.
Among coastal provinces and cities, members are relatively concentrated in Jiangsu, Shanghai and Zhejiang in the Yangtze River Delta, Guangdong (GD) in the Greater Bay Area, and Liaoning (LN) on the Bohai Bay, where the brand has built a certain influence and membership base. In other coastal areas such as Tianjin (TJ), Hebei (HE), Shandong (SD), Fujian (FJ) and Guangxi (GX), the membership base is relatively weak.
Figure 13. Distribution of penetration rates of members in each province.
Suggestion XI: Focus on two economic zones to achieve permeable regional expansion
We believe that a brand’s strength spills over into adjacent areas. We suggest that F brand connect its markets along the Yangtze River and build a core market along the Yangtze River economic belt, which has important strategic significance. The next step of offline expansion can therefore focus on Hunan, Jiangxi and Anhui, linking the Yangtze River basin into one continuous market.
If the coastal areas are taken as a second focus of development, offline channels can concentrate on Tianjin, Hebei, Shandong, Fujian, Guangxi and other regions. In this way, two belt markets can be formed, one centered on the coastal economic belt and one on the Yangtze River economic belt.
Once the two belt markets are established, the brand can penetrate adjacent markets along them, such as Henan, Shanxi, Shaanxi and Guizhou, and gradually achieve full coverage of the Chinese market.
Suggestion XII: Greatly enhance the penetration of the dominant market
It is important to note that the expansion strategy for the two belt markets is based purely on the distribution of existing members. We assume that an existing membership in a market means the brand has a good local base there. However, even in a relatively strong market such as Hubei, the penetration rate is only eight per ten thousand residents, which means that penetration of the existing markets is still very low. F brand therefore needs to greatly strengthen its member development system and greatly improve penetration of its core markets. In the next three to five years, F brand should increase its number of members more than tenfold in order to become an influential regional brand. This is a very difficult task, but its strategic importance is self-evident.
From the point of view of member management, F brand’s membership system is incomplete and a lot of data is missing. Even so, analysis of this incomplete data produced eight important business insights, which highlights the appeal of big data analytics. Based on these insights, and through our exchanges with industry experts, we made 12 suggestions on business development and member management. We believe these suggestions will offer useful guidance for the development of F brand and the upgrading of its member management system.
We will continue to track this project. We suggest that the enterprise integrate its member management system and transaction system into one system, and use the big data available from Tmall, JD.com and other platforms to analyze its member and consumption data more effectively and obtain more and deeper business insights. We also expect to use more advanced AI technology to predict members’ consumption behavior and better guide the development of the brand.
I carried out this research in the course Application of Data Analysis and Machine Learning at Carnegie Mellon University, under the careful direction of Professor Pradeep Ravikumar. F Company, a well-known Chinese women’s bag brand, provided the membership data and gave me a great deal of help in forming a series of data-based findings and recommendations, which the company has also used to guide its work. The CIS platform provided the opportunity to study at Carnegie Mellon University and professional guidance on writing this paper. I would like to express my sincere thanks to all of them for their help.