Native Market Factors for Pricing Cryptocurrencies

The cryptocurrency market has been growing frantically in number of cryptocurrencies, online exchanges, and market capitalization, which has amplified the need for comprehensive and robust pricing models. Using a database of all eligible cryptocurrencies listed on the CoinMarketCap website, we study the relationship between returns and several potential pricing factors, such as size (market capitalization), momentum, liquidity, and maturity. The analysis was conducted from December 27, 2013, to December 29, 2020, using weekly data for 3'667 cryptocurrencies. Results point out that portfolios of cryptocurrencies with smaller market capitalization, higher reversal, lower liquidity, and lower maturity tend to offer higher returns. The 5-factor model that additionally includes illiquidity and maturity performs better than the 3-factor model previously proposed in the literature, meaning that illiquidity and maturity significantly help capture the cross-sectional cryptocurrency risk premia. The 5-factor model presented seems robust to different procedures to construct portfolios and factors.


Introduction
The growth of the cryptocurrency market in terms of number of cryptocurrencies, online exchanges, and market capitalization has attracted more actual and potential individual and institutional investors. Consequently, the demand for financial studies has also increased, resulting in an exponential growth in the empirical finance literature applied to cryptocurrencies. Until 2017, attention was focused on a few major cryptocurrencies, such as Bitcoin, Ethereum, Litecoin, Tether and Ripple. More recent papers consider larger samples, covering more cryptocurrencies and longer periods.
For traditional financial markets, namely the stock market, several studies have attempted to identify the main pricing factors. The Capital Asset Pricing Model (CAPM), which considers just one factor, the market portfolio, is the simplest and best known of such models. On this topic, Fama (1970), Fama and French (1993), Carhart (1997), Fama and French (2012), and Fama and French (2015) are pivotal references. In the cryptocurrency market, this analysis is only beginning, with the additional difficulty that some of the factors designed for the stock market are not applicable. Shen et al. (2020) construct a 3-factor model for cryptocurrencies, which encompasses market, size, and momentum factors. Because the book-to-market factor does not apply to cryptocurrencies, the size factor has been constructed using size and momentum. More accurately, the momentum factor should be called reversal, as it seems that bad (good) past returns tend to be followed by good (bad) returns in the cryptocurrency market. Shahzad et al. (2020) elaborate on this model, adding a contagion factor.
This paper addresses the issue of which market-intrinsic factors are priced in the cryptocurrency market. The main objective of this research is twofold. First, to analyze several market features that may drive the prices of cryptocurrencies. Second, to use this information to derive a factor pricing model.
The principal data and methodological novelties that this study brings to the literature are the following:
a) Handling of a comprehensive dataset of cryptocurrencies, employing all the information in the CoinMarketCap website from April 30, 2013, to December 29, 2020.
b) Consideration of several features of the cryptocurrencies' ecosystem, namely market return, size, momentum, and, most importantly, liquidity and maturity.
c) Application of four different methodologies to construct the portfolios, namely sequential and intersecting double sorts, each with equally weighted and value-weighted portfolios.
d) Presentation of a 5-factor model that outperforms both the CAPM and the 3-factor model of Shen et al. (2020).
The remainder of this paper is organized as follows. Section 2 presents the arguments supporting additional factors in the pricing model and develops the additional hypotheses, contextualized in the literature. Section 3 explains the raw dataset, filtering procedures, and data aggregation. Section 4 presents the formulas used to compute the financial features of cryptocurrencies, and the methodology to construct the factors and portfolios used in the regression framework. Section 5 shows the main results and Section 6 performs some robustness checks. Section 7 concludes the paper.

Literature review and hypothesis development
An important strand of financial literature on cryptocurrencies focuses on the weak form of market efficiency, according to which the price system should contain all the relevant information on historical prices and other market-related variables, so that future prices cannot be predicted using past information.
More recently, other studies began testing the efficiency of other cryptocurrencies besides Bitcoin. Wei (2018) analyses 456 cryptocurrencies in 2017, when the value of the cryptocurrency market was skyrocketing. The author uses the Amihud illiquidity ratio (Amihud, 2002) to sort the cryptocurrencies into five groups and then applies the tests used in Urquhart (2016). Wei (2018) argues that, as more active and informed traders enter the market, liquidity increases while volatility decreases, creating fewer arbitrage opportunities, and hence, highly liquid cryptocurrencies tend to be more efficient. In the same line of thought, Brauneis and Mestel (2018) use 73 cryptocurrencies from August 31, 2015, to November 30, 2017, and conclude that as the liquidity of cryptocurrencies increases, they become less predictable and therefore more efficient. Al-Yahyaee et al. (2020) analyze the six cryptocurrencies with the highest market capitalization during the period from August 7, 2015, to July 3, 2018, showing that informational efficiency is directly linked to liquidity and that efficiency tends to increase as the market matures.
Several studies have tried to directly identify variables that have a significant relationship with the returns of cryptocurrencies; among these variables, size, momentum, trading volume, volatility, and maturity stand out (Liu et al., 2022). Kyriazis & Prassa (2019) analyse 846 cryptocurrencies from April 1, 2018, to January 31, 2019, when the market capitalization of cryptocurrencies was decreasing. They argue that during downward market movements, cryptocurrencies with higher market capitalization are also the ones with higher liquidity. The reasoning is that during bearish periods, investors in most markets tend to prefer assets with higher market capitalization and lower volatility. Brauneis et al. (2020) conclude that the liquidity of cryptocurrencies is mostly independent from other financial markets and depends mainly on intrinsic volatility and trading volume. Balcilar et al. (2017) show that trading volume can be used to predict Bitcoin returns, but only when the market is performing around the median. Burggraf and Rudolf (2020), using data on 1'000 cryptocurrencies from April 28, 2013, to November 1, 2019, show that higher volatility produces higher returns.
In a nutshell, these studies indicate that volatility is higher in more illiquid and younger cryptocurrencies. As risk should be rewarded by the market, we formulate the following hypotheses:
H1: Illiquidity increases the returns of cryptocurrencies; hence the illiquidity factor may be measured by a portfolio formed by a long position in illiquid cryptocurrencies and a short position in liquid cryptocurrencies.
H2: Maturity decreases the returns of cryptocurrencies; hence the maturity factor may be measured by a portfolio formed by a long position in younger cryptocurrencies and a short position in older cryptocurrencies.

Data and preliminary analysis
The dataset was retrieved from https://coinmarketcap.com, which is one of the most complete and reliable sources of information on cryptocurrencies. The legitimacy of this website derives from its use by many financial studies on cryptocurrencies. This website applies objective criteria that cryptocurrencies and online exchanges must meet to be listed. The sample covers the period from April 30, 2013, to December 29, 2020. The raw dataset is formed by 5'763 cryptocurrencies. For each cryptocurrency we retrieved the daily close prices, trading volume, and market capitalization, in USD, recorded at 00:00:00 UTC. According to CoinMarketCap, the close prices are volume-weighted index prices and daily volumes are the simple sum of the trading volume across several listed online exchanges.
Given that we use the complete set of listed cryptocurrencies, it is important to mention that the data does not suffer from survivorship bias, as the sample includes cryptocurrencies that did not reach its last day. The number of cryptocurrencies increased steadily from 7 on April 30, 2013, to 4'073 on December 29, 2020, but during the overall period covered, 5'763 cryptocurrencies were listed, hence 1'690 cryptocurrencies ceased to exist or were removed from the CoinMarketCap listing. This means that only around 70% survived until December 29, 2020.
The second step in preparing the dataset was filtering the raw data. This was conducted using three filter rules:
(1) Trading volume is missing from April 30, 2013, to December 27, 2013, so the sampling period begins on this last date. The period between these dates is only used to compute the maturity of cryptocurrencies.
(2) Some cryptocurrencies had missing days, probably due to communication failures between the exchanges and the CoinMarketCap website. If a particular day was missing, the gap was filled by linear interpolation. We proceeded in this way when there was a maximum of three days missing in a row. Larger gaps, mainly due to provisional listing on the CoinMarketCap website, were treated as if the cryptocurrency was nonexistent during that period.
(3) When a cryptocurrency is added to CoinMarketCap, the information on market capitalization for the first few days is usually incomplete or has clear mistakes. These days were ignored for these cryptocurrencies until they had information on all variables of interest.
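Filter rule (2) can be illustrated with a minimal sketch. It assumes that a daily series is held as a list in which missing days are marked with `None`; the function name `fill_short_gaps` and the treatment of gaps at the edges of the series are illustrative assumptions, not part of the original procedure.

```python
from typing import List, Optional


def fill_short_gaps(values: List[Optional[float]],
                    max_gap: int = 3) -> List[Optional[float]]:
    """Linearly interpolate runs of at most `max_gap` missing values (None).

    Longer runs, and runs touching either end of the series, are left as
    None (assumption: such days are treated as if the asset did not exist).
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i                       # first missing index of the run
        while i < n and filled[i] is None:
            i += 1
        end = i                         # first valid index after the run, or n
        gap = end - start
        if start == 0 or end == n or gap > max_gap:
            continue                    # no valid neighbour or gap too long
        left, right = filled[start - 1], filled[end]
        for j in range(start, end):
            weight = (j - start + 1) / (gap + 1)
            filled[j] = left + weight * (right - left)
    return filled


prices = [10.0, None, None, 13.0, None, None, None, None, 20.0]
print(fill_short_gaps(prices))
```

In this example the two-day gap is interpolated, while the four-day gap is left untouched.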
After applying these filters, we end up with 3'667 cryptocurrencies, 2'562 days, corresponding to 366 weeks. This daily database was then aggregated weekly, using Wednesday-to-Wednesday prices, volumes, and market capitalizations.
Tomé Lima, Helder Sebastião, Native Market Factors for Pricing Cryptocurrencies

Methodology
This section explains the construction of the time series of returns and other features, namely size, illiquidity, momentum, and maturity, for each cryptocurrency. It explains the construction of portfolios and presents some preliminary results that point out how to construct the pricing factors. Finally, it presents the procedures used to compute the pricing factors and the factor models.

Returns and other features
Since cryptocurrencies are studied cross-sectionally in aggregated terms, i.e., using portfolios, we use discrete returns, which are aggregable in the asset space. The close-to-close prices were used to compute the weekly returns of cryptocurrency $i$ as:

$$R_{i,t} = \frac{P_{i,t} - P_{i,t-7}}{P_{i,t-7}}$$

where $P_{i,t}$ and $P_{i,t-7}$ represent the close price on Wednesday $t$ and seven days before, respectively. The series of returns present massive extreme values, with some cryptocurrencies having returns over $10^4$. To winsorize the outliers but still maintain the main features of the data, namely volatility, we used an interquartile distance to identify and rescale outliers. We consider as an outlier any observation outside the interval $[p_{25} - k(p_{75} - p_{25}),\; p_{75} + k(p_{75} - p_{25})]$, where $p_{25}$ and $p_{75}$ are the 25th and 75th percentiles, respectively, and $k$ is a multiplier factor. We tested several multipliers, $k$ = 1.5, 3, 4.5, 6 and 7, and decided to use $k = 6$. Using this criterion, 99.81% and 89.96% of cryptocurrencies have less than 5% and 1% of outliers, respectively, which were rescaled to the limits of the above interval.
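The return computation and the interquartile rescaling can be sketched as follows. This is only an illustration under stated assumptions: the input is a list of consecutive Wednesday closes for one cryptocurrency, the interval is computed per series, and the quartiles use the "inclusive" method of Python's `statistics.quantiles`, since the text does not state which percentile definition was used. The function names are hypothetical.

```python
from statistics import quantiles


def weekly_returns(weekly_closes):
    """Discrete returns between consecutive weekly closes: P_t / P_{t-7} - 1."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(weekly_closes, weekly_closes[1:])]


def rescale_outliers(returns, k=6.0):
    """Clip returns to [p25 - k*IQR, p75 + k*IQR], with IQR = p75 - p25.

    Assumption: quartiles follow the 'inclusive' interpolation method.
    """
    p25, _, p75 = quantiles(returns, n=4, method="inclusive")
    iqr = p75 - p25
    lower, upper = p25 - k * iqr, p75 + k * iqr
    return [min(max(r, lower), upper) for r in returns]


closes = [1.00, 0.90, 0.90, 0.99, 1.188, 120.0]
rets = weekly_returns(closes)
print(rets)
print(rescale_outliers(rets, k=6.0))
```

Only the last, extreme return is moved to the upper limit of the interval; the remaining observations are unchanged, which is the sense in which the rescaling preserves the main features of the data.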
Size was simply proxied by the market capitalization. For the momentum we followed Shahzad et al. (2020) and Shen et al. (2020), who conclude that the best strategy, i.e., the one with the highest t-statistic, results from forming buy-sell portfolios based on the previous returns for a one-time holding period. This means constructing the portfolios at time $t-7$, based on the returns of the cryptocurrencies from $t-14$ to $t-7$, and holding them until Wednesday $t$, which translates into

$$MOM_{i,t} = R_{i,t-7}$$

where $R_{i,t-7}$ is the weekly return of cryptocurrency $i$ at $t-7$.
Brauneis et al. (2021) explore high and low frequency data for Bitcoin and Ethereum, testing different liquidity measures, and conclude that one of the best measures to describe the liquidity of cryptocurrencies is the Amihud illiquidity ratio (Amihud, 2002). Hence, illiquidity was measured by this ratio, which assesses the price impact of 1 USD of trading volume on the returns. Theoretically, the ratio ranges from 0 (most liquid) to $+\infty$ (most illiquid). For a given cryptocurrency $i$, the illiquidity ratio was computed as:

$$ILLIQ_{i,t} = \frac{1}{7}\sum_{\tau=t-6}^{t}\frac{|R_{i,\tau}|}{V_{i,\tau}}$$

where $R_{i,\tau}$ and $V_{i,\tau}$ are the arithmetic return and the volume traded in USD at day $\tau$, respectively.
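As an illustration of the Amihud ratio, the sketch below averages the daily ratio of absolute return to USD volume over the days of one week. The function name and the handling of zero-volume days (they are skipped) are assumptions made here for the example; the text does not say how such days were treated.

```python
def amihud_illiquidity(daily_returns, daily_volumes_usd):
    """Average of |R| / V over the days supplied (e.g., the 7 days of a week).

    Assumption: days with zero volume are skipped; if no day has positive
    volume, None is returned.
    """
    ratios = [abs(r) / v
              for r, v in zip(daily_returns, daily_volumes_usd)
              if v > 0]
    if not ratios:
        return None
    return sum(ratios) / len(ratios)


# Two assets with the same returns but very different traded volume.
week_returns = [0.01, -0.02, 0.00, 0.03, -0.01, 0.02, -0.03]
liquid = amihud_illiquidity(week_returns, [5e6] * 7)
illiquid = amihud_illiquidity(week_returns, [5e3] * 7)
print(liquid, illiquid)
```

With identical returns, the thinly traded asset gets a ratio about one thousand times larger, which is the sense in which higher values indicate more illiquid cryptocurrencies.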
For measuring the maturity of a cryptocurrency, we considered the number of weeks with valid data from its launching until day t.To compute this measure, we use all the data available since April 30, 2013.On this date only seven cryptocurrencies were listed, hence for all other cryptocurrencies, there is no measurement error.

Portfolios
We consider four features: size (market capitalization), momentum, measured by the previous weekly return, illiquidity, measured by the Amihud illiquidity ratio, and maturity, measured by the number of weeks since launching. These portfolios are constructed on t - 7 and held until t. Table 1 enables a first glance at the importance of each feature and the way that portfolios should be combined to compute the pricing factors. The patterns in Table 1 suggest that portfolio returns decrease with size, momentum, liquidity, and maturity. The size and momentum effects are in accordance with the literature (see, for instance, Shahzad et al., 2020; Shen et al., 2020; Liu et al., 2022). The reported illiquidity and maturity effects support our hypotheses H1 and H2, respectively.
To form double-sorted portfolios of cryptocurrencies we use a sequential procedure. This procedure is as follows: (1) at each t - 7, all cryptocurrencies are sorted based on the market capitalization (i.e., size) and are grouped into quintiles, (2) within each size quintile, cryptocurrencies are then sorted by the second feature and once again clustered into quintiles, (3) we then form value-weighted portfolios, using market capitalization as the weighting scheme, and compute their returns from t - 7 to t, which are then used to compute the excess returns in relation to the risk-free rate (1-month US Treasury bill). Hence, for each pair size/other feature we obtain 25 value-weighted portfolios. This approach is different from Fama and French (1993, 2012, 2015), who form 25 value-weighted portfolios by intersecting quintiles from a sort on size with the quintiles from an independent sort on the second feature. Our procedure produces portfolios with the same number of cryptocurrencies (except the last quintile portfolios, which include the remaining cryptocurrencies if the total number is not a multiple of 5), whilst the Fama-French approach gives portfolios with a variable number of cryptocurrencies. Another approach, such as the one used by Carhart (1997), is to construct equally weighted portfolios.
The weekly excess returns of these portfolios are presented in Table 2. Most portfolios' excess returns are significant at the 1% level, and portfolios of small, illiquid cryptocurrencies with lower momentum (higher reversal) and lower maturity have higher excess returns. Across all the different portfolios, it is clearly visible that portfolios with smaller size offer higher excess returns.
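The sequential double sort can be sketched for a single formation date as follows. The sketch assumes each cryptocurrency is a dictionary with hypothetical keys `cap` (market capitalization at formation), `ret` (return over the holding week) and a second feature such as `illiq`; the risk-free rate is not subtracted here, and the function names are illustrative.

```python
def split_into_groups(items, key, n_groups=5):
    """Sort by `key` and cut into n_groups of equal size.

    The last group also receives the remainder when the number of items is
    not a multiple of n_groups.
    """
    ordered = sorted(items, key=key)
    base = len(ordered) // n_groups
    groups = [ordered[g * base:(g + 1) * base] for g in range(n_groups - 1)]
    groups.append(ordered[(n_groups - 1) * base:])
    return groups


def value_weighted_return(group):
    """Market-cap-weighted average return; None for an empty group."""
    total_cap = sum(a["cap"] for a in group)
    if not group or total_cap == 0:
        return None
    return sum(a["cap"] * a["ret"] for a in group) / total_cap


def sequential_double_sort(assets, second_key):
    """5x5 grid of value-weighted returns: size quintiles first, then
    quintiles of the second feature within each size quintile."""
    grid = []
    for size_group in split_into_groups(assets, key=lambda a: a["cap"]):
        inner = split_into_groups(size_group, key=lambda a: a[second_key])
        grid.append([value_weighted_return(g) for g in inner])
    return grid


# Toy cross-section of 50 assets with made-up features.
assets = [{"cap": float(i + 1), "illiq": float((i * 7) % 50), "ret": 0.001 * i}
          for i in range(50)]
for row in sequential_double_sort(assets, "illiq"):
    print([round(x, 4) for x in row])
```

Because the second sort is done inside each size quintile, every cell holds the same number of assets (two in this toy example), which would not be guaranteed with two independent sorts.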

Pricing factors and models
The pricing factors are built on the previous portfolios, conditional on the pair size/other feature. For the market factor, like in the CAPM, we consider the value-weighted total market index (MKT) using all the cryptocurrencies in our filtered database as:

$$MKT_t = \sum_{i=1}^{N} R_{it}\,\frac{MarketCap_{it}}{\sum_{j=1}^{N} MarketCap_{jt}}$$

where $R_{it}$ is the return and $MarketCap_{it}$ is the market capitalization of cryptocurrency $i$ at the beginning of week $t$, and $N$ is the number of cryptocurrencies. Since cryptocurrencies do not have a book value, to construct the size factor, we follow the approach suggested by Shen et al. (2020) and use momentum as the second sort. From these two sorts, and similar to Fama and French (2015), we divide the size sort by percentile [0%, 10%] (Small) and percentile [90%, 100%] (Big), and the momentum sort by percentile [0%, 30%] (lower momentum, denoted by Down), percentile ]30%, 70%[ (Medium momentum) and percentile [70%, 100%] (higher momentum, denoted by Up). Then we intersect the size and momentum partitions, creating six value-weighted portfolios, respectively, SD, SM, SU, BD, BM, and BU.
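The value-weighted market index described in this subsection amounts to a capitalization-weighted average of individual returns. A minimal sketch for one week, with a hypothetical function name and made-up numbers, could be:

```python
def market_return(returns, market_caps):
    """Value-weighted market return for one week.

    `market_caps` are the capitalizations at the beginning of the week and
    `returns` are the returns over that week, in the same asset order.
    """
    total_cap = sum(market_caps)
    return sum(r * c for r, c in zip(returns, market_caps)) / total_cap


# A large asset with a small gain dominates a small asset with a large loss.
print(market_return([0.02, -0.30], [900.0, 100.0]))
```

The example shows why the index is driven by the largest cryptocurrencies: the weights are proportional to market capitalization.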
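From the evidence presented in Table 1 and Table 2, Small portfolios offer higher returns than Big portfolios, hence the size factor is defined as Small minus Big (SMB):

$$SMB = \frac{1}{3}(SD + SM + SU) - \frac{1}{3}(BD + BM + BU)$$

For the remaining factors, we proceeded in the same way but dropping the medium interval on the second feature. Our factors were, respectively, Down momentum minus Up momentum (DMU), Illiquid minus Liquid (IML), and Young minus Old (YMO). That is:

$$DMU = \frac{1}{2}(SD + BD) - \frac{1}{2}(SU + BU)$$

$$IML = \frac{1}{2}(SI + BI) - \frac{1}{2}(SL + BL)$$

$$YMO = \frac{1}{2}(SY + BY) - \frac{1}{2}(SO + BO)$$

where the second letter of each portfolio denotes Illiquid (I), Liquid (L), Young (Y), or Old (O). With all the portfolios and factors constructed, we proceeded with the estimation of the factor models using Ordinary Least Squares (OLS).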
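Given the weekly returns of the intersected portfolios, each factor is a simple difference of averages. The sketch below assumes the portfolio returns are stored in a dictionary keyed by two-letter labels (S or B for size, followed by a letter for the second feature); the labels, the function names and the numbers are illustrative assumptions.

```python
from statistics import mean


def size_factor(p):
    """SMB: average of the three Small portfolios minus the three Big ones."""
    return mean([p["SD"], p["SM"], p["SU"]]) - mean([p["BD"], p["BM"], p["BU"]])


def long_short_factor(p, long_tag, short_tag):
    """Average of the Small and Big long-leg portfolios minus the average of
    the Small and Big short-leg portfolios (medium group dropped)."""
    long_leg = mean([p["S" + long_tag], p["B" + long_tag]])
    short_leg = mean([p["S" + short_tag], p["B" + short_tag]])
    return long_leg - short_leg


# Made-up returns for one week of the six size/momentum portfolios.
week = {"SD": 0.06, "SM": 0.04, "SU": 0.02,
        "BD": 0.03, "BM": 0.01, "BU": -0.01}
print(size_factor(week))                   # Small minus Big
print(long_short_factor(week, "D", "U"))   # Down minus Up
```

The same `long_short_factor` helper would apply to the illiquidity and maturity factors once the corresponding size/illiquidity and size/maturity portfolios are available.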
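The first model only considers the market factor, similar to the CAPM, with the market portfolio proxied by the value-weighted market index, MKT:

$$R_{p,t} - Rf_t = a_p + \beta_{p,MKT}\,(MKT_t - Rf_t) + \varepsilon_{p,t}$$

where $R_{p,t}$, $Rf_t$, and $MKT_t$ are the return of portfolio $p$, the risk-free interest rate, and the market return at time $t$, respectively. As in Shen et al. (2020), the 3-factor model is defined by:

$$R_{p,t} - Rf_t = a_p + \beta_{p,MKT}\,(MKT_t - Rf_t) + \beta_{p,SMB}\,SMB_t + \beta_{p,DMU}\,DMU_t + \varepsilon_{p,t}$$

where SMB and DMU are respectively the size and momentum factors previously defined.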
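Our more encompassing model is a 5-factor model, defined as:

$$R_{p,t} - Rf_t = a_p + \beta_{p,MKT}\,(MKT_t - Rf_t) + \beta_{p,SMB}\,SMB_t + \beta_{p,DMU}\,DMU_t + \beta_{p,IML}\,IML_t + \beta_{p,YMO}\,YMO_t + \varepsilon_{p,t}$$

where IML and YMO are the illiquidity and maturity factors, respectively. As in Fama and French (2012), we defined the Sharpe ratio as:

$$SR(a) = \left(a'\,\Omega^{-1}\,a\right)^{1/2} \tag{12}$$

where $a$ is the column vector of the intercepts of the regressions and $\Omega$ is the covariance matrix of the error terms.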
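The estimation step can be sketched with NumPy: each test portfolio's excess return is regressed by OLS on a constant and the factors, and the intercepts and residuals are then combined into the Sharpe ratio of the intercepts. The function names, the simulated data and the use of the sample covariance of residuals (with NumPy's default degrees-of-freedom correction) are assumptions for illustration only.

```python
import numpy as np


def fit_factor_model(excess_returns, factors):
    """OLS of each portfolio's excess return on a constant and the factors.

    excess_returns: array (T, N); factors: array (T, K).
    Returns intercepts (N,), slopes (K, N) and residuals (T, N).
    """
    T = excess_returns.shape[0]
    X = np.column_stack([np.ones(T), factors])
    coef, *_ = np.linalg.lstsq(X, excess_returns, rcond=None)
    residuals = excess_returns - X @ coef
    return coef[0], coef[1:], residuals


def sharpe_ratio_of_intercepts(intercepts, residuals):
    """SR(a) = sqrt(a' Omega^{-1} a), Omega = covariance of the residuals."""
    omega = np.atleast_2d(np.cov(residuals, rowvar=False))
    return float(np.sqrt(intercepts @ np.linalg.solve(omega, intercepts)))


# Simulated example: 3 portfolios, 2 factors, 200 weeks (made-up numbers).
rng = np.random.default_rng(0)
factors = rng.normal(0.0, 0.05, size=(200, 2))
true_a = np.array([0.01, -0.02, 0.0])
true_b = np.array([[1.0, 0.5, 1.5], [0.2, -0.3, 0.0]])
excess = true_a + factors @ true_b + rng.normal(0.0, 0.01, size=(200, 3))

a_hat, b_hat, resid = fit_factor_model(excess, factors)
print(a_hat)
print(sharpe_ratio_of_intercepts(a_hat, resid))
```

A model that prices the test portfolios well should produce intercepts close to zero and therefore a small value of SR(a), which is why this statistic is reported alongside the average absolute intercept.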

Main empirical results
Table 5 presents a summary of the average statistics for the CAPM, 3-factor, and 5-factor models. This table highlights that the 5-factor model improves on the CAPM and on the 3-factor model. The average absolute intercept decreases and the GRS statistic (Gibbons et al., 1989) on the null hypothesis that the intercepts are jointly equal to zero, although still significant at the 1% level, decreases substantially. The average standard error of the intercepts decreases and the adjusted $R^2$ increases. Notice that although the additional factors are important in explaining the returns of cryptocurrencies, the market factor is undoubtedly the most important one.
Notes: This table presents the summary statistics from regressions on CAPM, 3-factor and 5-factor models. Each column corresponds to the average statistics for the regressions on sequential double-sort value-weighted portfolios. |a| is the average absolute intercept, $R^2$ is the average adjusted determination coefficient, s(a) is the average standard error of the intercepts. SR is the Sharpe ratio computed according to Equation (12). GRS is the statistic on the null hypothesis that the intercepts are jointly zero (Gibbons et al., 1989). The significance at the 1%, 5% and 10% is denoted by ***, **, *, respectively. Regressions were performed using 365 weekly observations, from January 1, 2014, to December 29, 2020. Source: Authors' own calculations.

Robustness checks
The results presented in the previous section may be sensitive to the way that factors and portfolios are constructed, hence we conduct several robustness checks on the CAPM, 3-factor and 5-factor models.
Procedure 1 - The same sequential double-sort procedure, but instead of using value-weighted portfolios when grouping the cryptocurrencies, we consider equally weighted portfolios.
Procedure 2 - For each pair size/other feature, portfolios are created using the Fama and French (1993, 2012, 2015) procedure, that is, by intersecting the independent sort on size with an independent sort on another feature. From these intersections we formed both (2.1) value-weighted and (2.2) equally weighted portfolios. Table 6 shows the summary statistics of Procedure 1 and Procedure 2.
Notes: This table presents the summary statistics for regressions on CAPM, 3-factor and 5-factor models considering different ways to construct the portfolios. Alternatives are the sequential double-sort but with equally weighted portfolios, the double-sort intersection value-weighted portfolios of Fama and French (1993, 2012, 2015), and the double-sort intersection but with equally weighted portfolios. Each column corresponds to the average statistics for the regressions. |a| is the average absolute intercept, $R^2$ is the average adjusted determination coefficient, s(a) is the average standard error of the intercepts. SR is the Sharpe ratio computed according to Equation (12). GRS is the statistic on the null hypothesis that all the intercepts for a set of regressions are jointly zero (Gibbons et al., 1989). The significance at the 1%, 5% and 10% is denoted by ***, **, *, respectively. Regressions were performed using 365 weekly observations, from January 1, 2014, to December 29, 2020. Source: Authors' own calculations.
Procedure 3 - For the previous factors we used the percentile interval [0%, 10%] for small cryptocurrencies and the interval [90%, 100%] for big cryptocurrencies. Here we use percentiles [0%, 50%] and ]50%, 100%], i.e., the median, to divide the cryptocurrencies into Small and Big. The breakpoints on the second feature are the same as before, using the intervals [0%, 30%], ]30%, 70%[, and [70%, 100%]. Using these factors, we estimate the three models for the following portfolios: (3.1) sequential double-sort value-weighted, (3.2) sequential double-sort equally weighted, (3.3) double-sort intersection value-weighted, and (3.4) double-sort intersection equally weighted. Table 7 shows the summary statistics of Procedure 3.
[...] contrary to Fama and French (2015) and Shen et al. (2020), who produce the value-weighted portfolios by intersecting two independent sorts, is a sequential double-sort procedure that produces portfolios with the same cardinality. However, our main results are not sensitive to the way that portfolios or even pricing factors are constructed. We were able to identify two additional pricing factors: illiquidity and maturity. Clearly the returns of cryptocurrencies are directly related to the evolution of the overall market, the most important pricing factor. However, there is compelling evidence that cryptocurrencies that have lower market capitalization (small size), are more illiquid, show higher reversals, and are less mature present higher returns.
Our 5-factor pricing model considers the market portfolio, size (Small minus Big, SMB), momentum (Down minus Up, DMU), illiquidity (Illiquid minus Liquid, IML), and maturity (Young minus Old, YMO). The inclusion of illiquidity and maturity improves the results in relation to the 3-factor model of Shen et al. (2020).
We should highlight that we are only dealing with native factors of the cryptocurrency market, i.e., factors that use the information intrinsic to the market. Other external factors, such as investors' attention, proxied for instance by Google searches, may be important, as seems to be the case for Bitcoin (see, for instance, Kristoufek, 2015; Dastgir et al., 2019; Anastasiou et al., 2021).

Table 1: Weekly mean returns of quintile portfolios
This table presents the weekly mean returns of value-weighted quintile portfolios. Each week, all cryptocurrencies were sorted by a given feature (size, measured by market capitalization; momentum, measured by the previous weekly return; illiquidity, measured by the Amihud illiquidity ratio; and maturity, measured by the number of weeks since launching) and partitioned into quintiles. Then, the value-weighted portfolio, where the weight of each cryptocurrency is given by its relative market capitalization, is computed for each quintile. The sample is from January 1, 2014, to December 29, 2020 (365 weeks).

Table 2: Average excess returns of sequential double-sorted value-weighted portfolios
In each week t - 7, all active cryptocurrencies were sorted into quintiles by size (market capitalization) and then, within these quintiles, were sorted by a second feature. The excess returns of week t were computed using the yield-to-maturity of the 1-month US Treasury bills. Portfolios are updated on a weekly basis (there are 365 weekly observations, from January 1, 2014, to December 29, 2020). The last column is obtained by subtracting in each week the portfolios in quintiles 1 and 5. Line S-B is obtained in each column by subtracting the line Big from the line Small.

Table 6: Robustness checks on the portfolio construction

Table 7: Robustness checks on the portfolio and factor constructions