<?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd"> <article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="1.0" xml:lang="en">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">IJCV</journal-id>
      <journal-title-group>
        <journal-title>International Journal of Coronaviruses</journal-title>
      </journal-title-group>
      <issn pub-type="epub">2692-1537</issn>
      <publisher>
        <publisher-name>Open Access Pub</publisher-name>
        <publisher-loc>United States</publisher-loc>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="publisher-id">IJCV-24-5129</article-id>
      <article-id pub-id-type="doi">10.14302/issn.2692-1537.ijcv-24-5129</article-id>
      <article-categories>
        <subj-group>
          <subject>research-article</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>The Covid-19 Pandemic and the Patterns of Nature</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Warr</surname>
            <given-names>Gregory</given-names>
          </name>
          <xref ref-type="aff" rid="idm1842611004">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Hatton</surname>
            <given-names>Les</given-names>
          </name>
          <xref ref-type="aff" rid="idm1842610356">2</xref>
          <xref ref-type="aff" rid="idm1842610500">*</xref>
        </contrib>
      </contrib-group>
      <aff id="idm1842611004">
        <label>1</label>
        <addr-line>Department of Biochemistry and Molecular Biology, Medical University of South Carolina, Charleston SC 29425 USA. </addr-line>
      </aff>
      <aff id="idm1842610356">
        <label>2</label>
        <addr-line>Faculty of Science, Engineering and Computing, Kingston University, Kingston, UK</addr-line>
      </aff>
      <aff id="idm1842610500">
        <label>*</label>
        <addr-line>Corresponding Author </addr-line>
      </aff>
      <contrib-group>
        <contrib contrib-type="editor">
          <name>
            <surname>Stoleski</surname>
            <given-names>Sasho</given-names>
          </name>
          <xref ref-type="aff" rid="idm1842458364">1</xref>
        </contrib>
      </contrib-group>
      <aff id="idm1842458364">
        <label>1</label>
        <addr-line>Institute of Occupational Health of R. Macedonia, WHO CC and Ga2len CC.</addr-line>
      </aff>
      <author-notes>
        <corresp>Les Hatton, <addr-line>Faculty of Science, Engineering and Computing, Kingston University, Kingston, UK</addr-line>, <email>lesh@oakcomp.co.uk</email></corresp>
        <fn fn-type="conflict" id="idm1841608996">
          <p>The authors declare no competing interests.</p>
        </fn>
      </author-notes>
      <pub-date pub-type="epub" iso-8601-date="2024-07-30">
        <day>30</day>
        <month>07</month>
        <year>2024</year>
      </pub-date>
      <volume>5</volume>
      <issue>1</issue>
      <fpage>10</fpage>
      <lpage>17</lpage>
      <history>
        <date date-type="received">
          <day>22</day>
          <month>05</month>
          <year>2024</year>
        </date>
        <date date-type="accepted">
          <day>22</day>
          <month>06</month>
          <year>2024</year>
        </date>
        <date date-type="online">
          <day>30</day>
          <month>07</month>
          <year>2024</year>
        </date>
      </history>
      <permissions>
        <copyright-statement>© 2024 Gregory Warr, et al.</copyright-statement>
        <copyright-year>2024</copyright-year>
        <copyright-holder>Gregory Warr, et al</copyright-holder>
        <license xlink:href="http://creativecommons.org/licenses/by/4.0/" xlink:type="simple">
          <license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</license-p>
        </license>
      </permissions>
      <self-uri xlink:href="http://openaccesspub.org/ijcv/article/2146">This article is available from http://openaccesspub.org/ijcv/article/2146</self-uri>
      <abstract>
        <p>This paper addresses, in broad terms, the impact that unprecedented levels of scientific discovery can have on the emergent global patterns that we observe in nature. An essentially ubiquitous pattern associated with large, complex discrete systems is attributable to the Conservation of Hartley-Shannon Information (CoHSI). One manifestation of CoHSI in the realm of protein structure is a distinctive equilibrium distribution of protein lengths that is dominated by a power-law. Here we examine how the accelerated pace of novel protein discovery during the Covid-19 pandemic affected this distribution, showing that, despite an initial disruption, the equilibrium state was re-established.</p>
      </abstract>
      <kwd-group>
        <kwd>SARS-CoV-2</kwd>
        <kwd>Conservation Principle</kwd>
        <kwd>Statistical Mechanics</kwd>
        <kwd>Hartley Shannon Information</kwd>
        <kwd>Power-Law</kwd>
        <kwd>Protein Length</kwd>
      </kwd-group>
      <counts>
        <fig-count count="4"/>
        <table-count count="0"/>
        <page-count count="8"/>
      </counts>
    </article-meta>
  </front>
  <body>
    <sec id="idm1842459300" sec-type="intro">
      <title>Introduction</title>
      <p>This paper uses a novel approach to study change in the rapidly evolving and globally accessible TrEMBL protein databases <xref ref-type="bibr" rid="ridm1842053924">1</xref>. These databases accumulate the sequenced proteins that result from the efforts of countless teams of researchers around the planet. After more than 25 years of growth they are already extremely large and still growing rapidly. The most recent release at the time of writing, release 2024_02 of UniProtKB/TrEMBL dated 27-Mar-2024, contains 248,234,451 sequence entries comprising 87,367,689,973 amino acids. The proteins vary in length from the shortest, A0A0G2JLF7_HUMAN at just 7 amino acids, all the way up to a staggering 45,354 amino acids in the longest currently known, A0A5A9P0L4_9TELE. Clearly, numbers this large are effectively intractable when it comes to identifying local patterns, or even phylogenetically shared patterns. However, over the last 150 years the discipline of statistical physics, from its origins in the kinetic theory of gases in the hands of the visionary physicist Ludwig Boltzmann, has produced techniques for dealing with such extraordinarily large numbers. Compared with the number of molecules in 1 cubic meter of gas at standard temperature and pressure (2.68 × 10<sup>25</sup>), even the TrEMBL databases pale into insignificance. The methodology of statistical physics handles such vast numbers comfortably, automatically aggregating any and all local mechanisms to go directly to the equilibrium, or most likely, distribution. In the case of a gas, the molecular velocities all asymptote to the Maxwell-Boltzmann distribution <xref ref-type="bibr" rid="ridm1842050396">2</xref>.</p>
      <p>By incorporating information theory into this methodology, it was shown <xref ref-type="bibr" rid="ridm1842059188">3</xref>, <xref ref-type="bibr" rid="ridm1842119876">4</xref> that <italic>all</italic> discrete systems (systems composed of countable pieces), sharing only the property that their individual pieces are distinguishable and requiring no other commonality, exhibit patterns dominated by a power-law. These patterns arise from a conservation principle, the Conservation of Hartley-Shannon Information (CoHSI). If we consider the most basic discrete system, one that consists of ordered strings (components) of coloured beads, then the length distribution that CoHSI predicts for any discrete system looks like <xref ref-type="fig" rid="idm1842015316">Figure 1</xref>. This distribution has now been identified in a wide variety of dissimilar discrete systems: the lengths of words in texts, the number of words in sentences, large collections of software irrespective of their language or functionality and, most notably for the purposes of this paper, proteins <xref ref-type="bibr" rid="ridm1842059188">3</xref>. The prior knowledge that such an equilibrium distribution of lengths is present in any substantial collection of proteins guides our novel approach: it allows us to measure significant departures from that equilibrium state and to understand the reasons for them, and this we now do.</p>
      <p>Proteins are an exemplary discrete system in that we can consider them as strings of amino acids, the length of a protein being the total number of amino acids it contains. The TrEMBL database <xref ref-type="bibr" rid="ridm1842053924">1</xref> represents essentially the totality of our knowledge of the structure and diversity of proteins. One might expect the distribution of protein lengths to be shaped by natural selection acting on particular structure/function properties. However, whereas individual proteins are subject to natural selection in the normal way, the overall distribution of the properties of proteins results from the complex interaction of many processes: natural selection, genetic drift, random extinctions and so on. CoHSI, because of its statistical mechanical framework <xref ref-type="bibr" rid="ridm1842059188">3</xref>, aggregates all these mechanisms and predicts a scale-free equilibrium outcome that is simply the overwhelmingly most likely state. The theoretically predicted length distribution for any discrete system is shown in <xref ref-type="fig" rid="idm1842015316">Figure 1</xref>.</p>
      <fig id="idm1842015316">
        <label>Figure 1.</label>
        <caption>
          <title>The predicted asymptotic probability distribution function for a set of strings (components) of coloured beads of various lengths with no other property than that the different colours are distinguishable. The distribution shows a sharp unimodal peak transitioning into an extremely precise power-law tail <xref ref-type="bibr" rid="ridm1842059188">3</xref>.</title>
        </caption>
        <graphic xlink:href="images/image1.jpg" mime-subtype="jpg"/>
      </fig>
      <p>Thus it was predicted (and borne out experimentally) that the length distributions of proteins would show the scale-free distributions implied by CoHSI <xref ref-type="bibr" rid="ridm1842059188">3</xref>, <xref ref-type="bibr" rid="ridm1842119876">4</xref>, <xref ref-type="bibr" rid="ridm1841905540">5</xref>. Prior to the Covid-19 pandemic the predicted CoHSI distribution can be seen clearly in protein lengths, for example in the 2017 TrEMBL release 17-03 shown in <xref ref-type="fig" rid="idm1842020932">Figure 2</xref>a and <xref ref-type="fig" rid="idm1842020932">Figure 2</xref>b (we consider other releases of TrEMBL as this narrative develops). The similarity between the CoHSI prediction in <xref ref-type="fig" rid="idm1842015316">Figure 1</xref> and the data of <xref ref-type="fig" rid="idm1842020932">Figure 2</xref>a is compelling both visually and statistically.</p>
      <p><xref ref-type="fig" rid="idm1842020932">Figure 2</xref>b is a complementary cumulative distribution function <xref ref-type="bibr" rid="ridm1841901724">6</xref>, a widely used noise-suppressing display of the same data as <xref ref-type="fig" rid="idm1842020932">Figure 2</xref>a. The left-hand (y-)axis gives the number of proteins longer than the length shown on the x-axis. On the left-hand side the curve is flat, corresponding to the sharp rise to the peak of <xref ref-type="fig" rid="idm1842015316">Figure 1</xref>. Reading off this plateau height on the y-axis gives the total number of proteins considered (just under 1 × 10<sup>8</sup> in release 17-03). As we move right and the length increases, fewer and fewer proteins exceed that length, and the data become the classic straight line on a log-log scale that indicates the presence of the predicted power-law. A straight line is, however, only a necessary condition for a power-law; for greater statistical confidence a sufficiency test must also be run <xref ref-type="bibr" rid="ridm1841878692">7</xref>, <xref ref-type="bibr" rid="ridm1841873796">8</xref>. Details are given in <xref ref-type="bibr" rid="ridm1842059188">3</xref>, where an emphatic power-law is confirmed.</p>
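      <p>The construction of such a complementary cumulative display can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not the code used for this study: the function name ccdf_counts and the toy list of lengths are hypothetical, and each point counts the sequences that are at least as long as the given length, so that the first point equals the total number of sequences (the plateau height described above).</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def ccdf_counts(lengths):
    """Return sorted (length, n) pairs, where n is the number of
    sequences whose length is at least `length`."""
    counts = Counter(lengths)
    remaining = len(lengths)  # every sequence is at least as long as the shortest
    pairs = []
    for length in sorted(counts):
        pairs.append((length, remaining))
        remaining -= counts[length]  # drop the sequences of exactly this length
    return pairs


# Toy data only (assumed for illustration), not taken from TrEMBL.
toy_lengths = [7, 120, 120, 350, 350, 350, 7096, 45354]
toy_ccdf = ccdf_counts(toy_lengths)
```
]]></preformat>
      <p>Plotting both members of each pair on logarithmic axes gives a display of the kind described above. As the text notes, a straight tail in such a plot is only a necessary condition for a power-law, so a sketch like this does not replace a formal test.</p>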
      <fig id="idm1842020932">
        <label>Figure 2.</label>
        <caption>
          <title>The distribution of lengths of proteins measured in amino acids in TrEMBL release 17-03: A) the distribution as a probability distribution function and B) the distribution as a complementary cumulative distribution function.</title>
        </caption>
        <graphic xlink:href="images/image2.jpg" mime-subtype="jpg"/>
      </fig>
      <p>In essence, the above development establishes <xref ref-type="fig" rid="idm1842020932">Figure 2</xref>b as an <italic>equilibrium distribution</italic>, an emergent property shared by all discrete systems <xref ref-type="bibr" rid="ridm1842119876">4</xref>. In the parlance of statistical mechanics, the mathematical framework behind CoHSI, the biochemical properties of the individual amino acids in the global system of proteins are irrelevant; proteins can be considered simply as strings of distinguishable amino acids <xref ref-type="bibr" rid="ridm1842059188">3</xref>. The pattern of protein lengths shown in <xref ref-type="fig" rid="idm1842020932">Figure 2</xref>b is by definition an equilibrium distribution and, for such a large collection, we would normally expect little to disturb this equilibrium. However, the Covid-19 pandemic prompted extraordinary activity in protein discovery focused on SARS-CoV-2, beginning in 2019 and continuing in the years that followed; in the following section we explore the impact of this activity on the equilibrium distribution of protein lengths.</p>
    </sec>
    <sec id="idm1842442844" sec-type="results">
      <title>Results and Discussion</title>
      <sec id="idm1842441692">
        <title>The TrEMBL databases 15-07 → 22-02</title>
        <p>Having established that there is an equilibrium distribution in protein lengths, we can study how different versions of TrEMBL behaved as the database grew rapidly over the last few years. <xref ref-type="fig" rid="idm1842000892">Figure 3</xref> illustrates this with five releases: 15-07, 17-03, 19-04, 21-03 and 22-02. We first note that the system does indeed closely maintain the equilibrium distribution until release 21-03, where a break suddenly appears in the region of protein lengths of 6,500 to 7,500 amino acids. Looking at this more closely, the break in <xref ref-type="fig" rid="idm1841997508">Figure 4</xref>a is due to an over-abundance (i.e. relative to the equilibrium distribution) of proteins with lengths of approximately 7,000 amino acids. <xref ref-type="fig" rid="idm1841997508">Figure 4</xref>b shows that 12 months later the break had already healed and the database had resumed its natural growth trend around the equilibrium distribution defined by CoHSI and exhibited in every other TrEMBL release.</p>
        <fig id="idm1842000892">
          <label>Figure 3.</label>
          <caption>
            <title>Five recent releases of TrEMBL spanning the Covid-19 pandemic through 2022.</title>
          </caption>
          <graphic xlink:href="images/image3.jpg" mime-subtype="jpg"/>
        </fig>
        <fig id="idm1841997508">
          <label>Figure 4.</label>
          <caption>
            <title>The distribution of lengths of proteins measured in amino acids in TrEMBL: A) release 21-03, illustrating the clear departure from the equilibrium predicted by CoHSI, due to the uploading of considerable selective work on the SARS-CoV-2 virus, and B) release 22-02, 12 months later, when the equilibrium was essentially restored.</title>
          </caption>
          <graphic xlink:href="images/image4.jpg" mime-subtype="jpg"/>
        </fig>
      </sec>
      <sec id="idm1842447236">
        <title>Covid-19 and the Equilibrium of Protein Lengths</title>
        <p>What happened between TrEMBL releases 21-03 and 22-02 to explain, first, why the CoHSI equilibrium was perturbed and, second, how it was re-established? Although the only constraints in CoHSI theory <xref ref-type="bibr" rid="ridm1842059188">3</xref> are the total size of the system and its total information content, it is necessary that the system can be categorized in a consistent manner. In the case of the TrEMBL database, consistency of categorization means that no redundancy in the sequence entries is permitted. In other words, each database entry has a unique combination of two pieces of data: first, the species (and, in the case of viruses, also the strain) expressing the protein; and second, the exact number and sequence of amino acids in the protein. Only a single database entry with a given combination of species (strain) and protein sequence is permitted, and any other submitted entries that are identical in both properties are eliminated by curation. While such redundancy is normally eliminated by the active curation of the database, this curation was relaxed early in the Covid-19 pandemic, when a special portal <ext-link xlink:href="https://www.ebi.ac.uk/training/events/uniprot-covid-19-website/-" ext-link-type="uri">https://www.ebi.ac.uk/training/events/uniprot-covid-19-website/-</ext-link> (now closed) was created by UniProt for the submission of SARS-CoV-2 sequences.</p>
        <p>Tens of millions of SARS-CoV-2 protein sequences have been uploaded to the protein databases, and we note that the ORF1ab polyprotein of SARS-CoV-2 contains 7096 amino acids.</p>
        <p>We suggest that the massive uploading of presumptively redundant SARS-CoV-2 sequences resulted in the perturbation of the equilibrium seen in TrEMBL release 21-03. The resumption of normal curation of the database would have eliminated the redundancies created by this large volume of identical submissions of SARS-CoV-2 proteins, re-establishing the equilibrium as seen in TrEMBL release 22-02.</p>
        <p>Thus, while the CoHSI equilibrium as exemplified globally in protein lengths is remarkably stable, it is at the same time sensitive to the consistency of categorization, as revealed by the unprecedented number of presumably redundant SARS-CoV-2 sequences that were submitted to TrEMBL early in the Covid-19 pandemic.</p>
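        <p>The effect of redundancy on a length distribution can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the functions curate and length_histogram, the organism labels and the short sequences are all hypothetical, and the sketch does not represent UniProt's actual curation procedure. It simply keeps one entry per combination of organism (strain) and sequence and compares the counts at each length before and after.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def curate(entries):
    """Keep the first entry for each (organism, sequence) pair and
    drop exact repeats of that pair."""
    seen = set()
    kept = []
    for organism, sequence in entries:
        key = (organism, sequence)
        if key not in seen:
            seen.add(key)
            kept.append(key)
    return kept


def length_histogram(entries):
    """Count how many entries there are at each sequence length."""
    return Counter(len(sequence) for _, sequence in entries)


# Toy submissions (assumed for illustration): one strain and sequence
# uploaded three times, a second strain with the same sequence, and
# one unrelated short protein.
submissions = [
    ("virus strain A", "MESLVPGFNE"),
    ("virus strain A", "MESLVPGFNE"),
    ("virus strain A", "MESLVPGFNE"),
    ("virus strain B", "MESLVPGFNE"),
    ("some bacterium", "MKT"),
]

before = length_histogram(submissions)         # 4 entries of length 10
after = length_histogram(curate(submissions))  # 2 entries of length 10
```
]]></preformat>
        <p>In this toy case the repeated submissions double the count at a single length, and removing exact repeats brings it back down while leaving the other lengths untouched. This mirrors, in miniature, the explanation suggested above for the excess near 7,000 amino acids.</p>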
      </sec>
      <sec id="idm1842422068">
        <title>The Covid-19 Pandemic in Perspective</title>
        <p>As described above, the Covid-19 pandemic perturbed the equilibrium of protein length distributions as a result of the unprecedented burst of research into the SARS-CoV-2 virus. However, many other aspects of the Covid-19 pandemic also show power-law behaviour, as would be expected from any large, complex discrete system and as predicted by CoHSI theory <xref ref-type="bibr" rid="ridm1842059188">3</xref>, <xref ref-type="bibr" rid="ridm1842119876">4</xref>. Examples of power-law distributions can be found early in the course of the pandemic, when the infection was spreading essentially without control and before cases began to reach saturation. Blasius <xref ref-type="bibr" rid="ridm1841863692">9</xref> examined the relative size of outbreaks in countries that reported statistics and showed that both the number of infected people and the number of deaths displayed power-law distributions. Similar results showing a power-law distribution were reported for Covid-19 fatalities in European countries <xref ref-type="bibr" rid="ridm1841862468">10</xref>. Blasius <xref ref-type="bibr" rid="ridm1841863692">9</xref> also examined the statistics for SARS-CoV-2 infections and deaths reported by counties within the United States; these also showed power-law distributions. These results from the Covid-19 pandemic are not unique; the general presence of power-laws in epidemics of infectious diseases has been known for some time <xref ref-type="bibr" rid="ridm1841867508">11</xref>, <xref ref-type="bibr" rid="ridm1841855692">12</xref>. For example, Rhodes and Anderson <xref ref-type="bibr" rid="ridm1841867508">11</xref>, <xref ref-type="bibr" rid="ridm1841853532">13</xref> showed that both the size and duration of measles epidemics were characterized by power-law distributions. It is worth pointing out that early in the response to the Covid-19 pandemic, when specific vaccines first became available, levels of immunization were highly unequal between countries. As would have been predicted, the number of individuals immunized within individual countries was also observed to follow a power-law distribution <xref ref-type="bibr" rid="ridm1842119876">4</xref>.</p>
      </sec>
    </sec>
    <sec id="idm1842423436" sec-type="conclusions">
      <title>Conclusions</title>
      <p>It is reasonable to ask why power-laws, as described here in the impacts of the Covid-19 pandemic, are essentially ubiquitous in the natural world. As reviewed in detail in <xref ref-type="bibr" rid="ridm1842119876">4</xref>, power-laws are observed in phenomena as diverse as wealth and the frequency of word use, from the size of computer programs to the quantity of alcohol consumed, and from the growth of oyster shells to the size of craters on the moon. Logically, there are only two possibilities. Either there is a single underlying principle that generates power-law behaviour in complex discrete systems, or there are many specific local mechanisms that coincidentally generate the same outcomes, i.e. power-law distributions, in very different systems. CoHSI theory <xref ref-type="bibr" rid="ridm1842059188">3</xref>, <xref ref-type="bibr" rid="ridm1842119876">4</xref> provides a resolution of these two explanations: complex discrete systems are mechanistically complicated (and often seemingly random) but, regardless of any and all mechanisms, distributions dominated by power-laws are the essentially inevitable equilibrium state of these systems.</p>
    </sec>
    <sec id="idm1842421780">
      <title>Ethics</title>
      <p>No human subjects, human tissues or animals were used in this research.</p>
    </sec>
    <sec id="idm1842422860">
      <title>Data Accessibility</title>
      <p>This study adheres to the transparency and reproducibility principles espoused by <xref ref-type="bibr" rid="ridm1841848492">14</xref>, <xref ref-type="bibr" rid="ridm1841846476">15</xref>, <xref ref-type="bibr" rid="ridm1841844460">16</xref>, <xref ref-type="bibr" rid="ridm1841857276">17</xref>, <xref ref-type="bibr" rid="ridm1841829444">18</xref>, <xref ref-type="bibr" rid="ridm1841825844">19</xref> and includes references to all methods and source code necessary to reproduce the results presented. For this study, the methods and source code are included in the wider set of <italic>reproducibility deliverables</italic>, available at <ext-link xlink:href="https://datadryad.org/stash/share/9nVGYwauP_wdFM84hA6GlC52t4pFircx4NAGl_ukbYA" ext-link-type="uri">https://datadryad.org/stash/share/9nVGYwauP_wdFM84hA6GlC52t4pFircx4NAGl_ukbYA</ext-link>. Each reproducibility deliverable allows all results, tables and diagrams to be reproduced individually for that study, as well as performing verification checks on machine environment, availability of essential open-source packages, quality of arithmetic and regression testing of the outputs <xref ref-type="bibr" rid="ridm1841822820">20</xref>.</p>
    </sec>
    <sec id="idm1842421420" sec-type="supplementary-material">
      <title>Supplementary Materials</title>
      <p>No supplementary materials directly accompany this paper.</p>
    </sec>
    <sec id="idm1842423652">
      <title>Authors’ Contributions</title>
      <p>LH performed the analyses. LH and GW developed the arguments, discussed the results and contributed to the text of the manuscript. Both authors gave final approval for publication.</p>
    </sec>
    <sec id="idm1842423148">
      <title>Funding</title>
      <p>No institutional or external funding for this research was received by the authors.</p>
    </sec>
    <sec id="idm1842421060">
      <title>Acknowledgments</title>
      <p>The authors acknowledge the many researchers with whom they have discussed the implications of CoHSI in biological systems over the years, most notably the late Professor Bob Chapman who gave freely of his insights and vast experience.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ridm1842053924">
        <label>1.</label>
        <mixed-citation xlink:type="simple" publication-type="journal"><collab>The UniProt Consortium</collab><article-title>UniProt: the Universal Protein Knowledgebase in 2023</article-title><date><year>2022</year></date><source>Nucleic Acids Research</source><volume>51</volume><issue>D1</issue><fpage>D523</fpage><lpage>D531</lpage>
Available from: https://doi.org/10.1093/nar/gkac1052
</mixed-citation>
      </ref>
      <ref id="ridm1842050396">
        <label>2.</label>
        <mixed-citation xlink:type="simple" publication-type="journal">
          <name>
            <surname>Sommerfeld</surname>
            <given-names>A</given-names>
          </name>
          <article-title>Thermodynamics and Statistical Mechanics</article-title>
          <date>
            <year>1956</year>
          </date>
          <publisher-name>Academic Press</publisher-name>
          <publisher-loc>New York, NY</publisher-loc>
        </mixed-citation>
      </ref>
      <ref id="ridm1842059188">
        <label>3.</label>
        <mixed-citation xlink:type="simple" publication-type="journal">
          <name>
            <surname>Hatton</surname>
            <given-names>L</given-names>
          </name>
          <name>
            <surname>Warr</surname>
            <given-names>G</given-names>
          </name>
          <article-title>Strong evidence of an information theoretical conservation principle linking all discrete systems</article-title>
          <date>
            <year>2019</year>
          </date>
          <source>R Soc Open Sci</source>
          <fpage>11</fpage>
          <lpage>6</lpage>
        </mixed-citation>
      </ref>
      <ref id="ridm1842119876">
        <label>4.</label>
        <mixed-citation xlink:type="simple" publication-type="journal">
          <name>
            <surname>Hatton</surname>
            <given-names>L</given-names>
          </name>
          <name>
            <surname>Warr</surname>
            <given-names>G W</given-names>
          </name>
          <article-title>Exposing Nature’s Bias: The Hidden Clockwork behind Society, Life and the Universe</article-title>
          <date>
            <year>2022</year>
          </date>
          <publisher-name>Bluespear Publishing</publisher-name>
          <source>Isbn</source>
          <fpage>978</fpage>
          <lpage>1</lpage>
        </mixed-citation>
      </ref>
      <ref id="ridm1841905540">
        <label>5.</label>
        <mixed-citation xlink:type="simple" publication-type="journal">
          <name>
            <surname>Hatton</surname>
            <given-names>L</given-names>
          </name>
          <name>
            <surname>Warr</surname>
            <given-names>G</given-names>
          </name>
          <article-title>Protein Structure and Evolution: Are They Constrained Globally by a Principle Derived from Information Theory?</article-title>
          <date>
            <year>2015</year>
          </date>
          <source>PLOS ONE</source>
          <volume>10</volume>
          <elocation-id>e0125663</elocation-id>
        </mixed-citation>
      </ref>
      <ref id="ridm1841901724">
        <label>6.</label>
        <mixed-citation xlink:type="simple" publication-type="journal">
          <name>
            <surname>Newman</surname>
            <given-names>M E J</given-names>
          </name>
          <article-title>Power laws, Pareto distributions and Zipf’s law</article-title>
          <date>
            <year>2006</year>
          </date>
          <source>Contemporary Physics</source>
          <volume>46</volume>
          <fpage>323</fpage>
          <lpage>351</lpage>
        </mixed-citation>
      </ref>
      <ref id="ridm1841878692">
        <label>7.</label>
        <mixed-citation xlink:type="simple" publication-type="journal"><name><surname>Clauset</surname><given-names>A</given-names></name><name><surname>Shalizi</surname><given-names>C R</given-names></name><name><surname>Newman</surname><given-names>M E J</given-names></name><article-title>Power-Law Distributions in Empirical Data</article-title><date><year>2009</year></date><source>SIAM Review</source><volume>51</volume><issue>4</issue><fpage>661</fpage><lpage>703</lpage>
Available from: https://doi.org/10.1137/070710111
</mixed-citation>
      </ref>
      <ref id="ridm1841873796">
        <label>8.</label>
        <mixed-citation xlink:type="simple" publication-type="journal">
          <name>
            <surname>Clauset</surname>
            <given-names>A</given-names>
          </name>
          <article-title>Inference, Models and Simulation for Complex Systems</article-title>
          <date>
            <year>2011</year>
          </date>
          <source>Lectures</source>
          <volume>available</volume>
          <fpage>7000</fpage>
          <lpage>7000</lpage>
        </mixed-citation>
      </ref>
      <ref id="ridm1841863692">
        <label>9.</label>
        <mixed-citation xlink:type="simple" publication-type="journal"><name><surname>Blasius</surname><given-names>B</given-names></name><article-title>Power-law distribution in the number of confirmed COVID-19 cases</article-title><date><year>2020</year></date><source>Chaos</source><volume>30</volume><fpage>093123</fpage>
Available from: https://pubmed.ncbi.nlm.nih.gov/33003939
</mixed-citation>
      </ref>
      <ref id="ridm1841862468">
        <label>10.</label>
        <mixed-citation xlink:type="simple" publication-type="journal">
          <name>
            <surname>Xenikos</surname>
            <given-names>D G</given-names>
          </name>
          <name>
            <surname>Asimakopoulos</surname>
            <given-names>A</given-names>
          </name>
          <article-title>Power-law growth of the COVID-19 fatality incidents in Europe</article-title>
          <date>
            <year>2021</year>
          </date>
          <source>Infectious Disease Modelling</source>
          <volume>6</volume>
          <fpage>743</fpage>
          <lpage>750</lpage>
        </mixed-citation>
      </ref>
      <ref id="ridm1841867508">
        <label>11.</label>
        <mixed-citation xlink:type="simple" publication-type="journal">
          <name>
            <surname>Rhodes</surname>
            <given-names>C J</given-names>
          </name>
          <name>
            <surname>Anderson</surname>
            <given-names>R M</given-names>
          </name>
          <article-title>Power laws governing epidemics in isolated populations</article-title>
          <date>
            <year>1996</year>
          </date>
          <source>Nature</source>
          <volume>381</volume>
          <fpage>600</fpage>
        </mixed-citation>
      </ref>
      <ref id="ridm1841855692">
        <label>12.</label>
        <mixed-citation xlink:type="simple" publication-type="journal"><name><surname>Meyer</surname><given-names>S</given-names></name><name><surname>Held</surname><given-names>L</given-names></name><article-title>Power-law models for infectious disease spread</article-title><date><year>2014</year></date><source>The Annals of Applied Statistics</source><volume>8</volume><issue>3</issue><fpage>1612</fpage><lpage>1639</lpage>
Available from: https://doi.org/10.1214/14-AOAS743
</mixed-citation>
      </ref>
      <ref id="ridm1841853532">
        <label>13.</label>
        <mixed-citation xlink:type="simple" publication-type="journal">
          <name>
            <surname>Rhodes</surname>
            <given-names>C J</given-names>
          </name>
          <name>
            <surname>Anderson</surname>
            <given-names>R M</given-names>
          </name>
          <article-title>A scaling analysis of measles epidemics in a small population</article-title>
          <date>
            <year>1996</year>
          </date>
          <source>Philosophical Transactions of the Royal Society of London Series B: Biological Sciences</source>
        </mixed-citation>
      </ref>
      <ref id="ridm1841848492">
        <label>14.</label>
        <mixed-citation xlink:type="simple" publication-type="book">
          <name>
            <surname>Popper</surname>
            <given-names>K</given-names>
          </name>
          <article-title>The Logic of Scientific Discovery</article-title>
          <date>
            <year>1959</year>
          </date>
          <publisher-name>Routledge</publisher-name>
        </mixed-citation>
      </ref>
      <ref id="ridm1841846476">
        <label>15.</label>
        <mixed-citation xlink:type="simple" publication-type="journal">
          <name>
            <surname>Ziolkowski</surname>
            <given-names>A M</given-names>
          </name>
          <article-title>Further Thoughts on Popperian Geophysics–the Example of Deconvolution</article-title>
          <date>
            <year>1982</year>
          </date>
          <source>Geophysical Prospecting</source>
          <pub-id pub-id-type="doi">10.1371/journal.pcbi.1003285</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ridm1841844460">
        <label>16.</label>
        <mixed-citation xlink:type="simple" publication-type="book">
          <name>
            <surname>Claerbout</surname>
            <given-names>J F</given-names>
          </name>
          <name>
            <surname>Karrenbach</surname>
            <given-names>M</given-names>
          </name>
          <article-title>Electronic documents give reproducibility a new meaning</article-title>
          <date>
            <year>1992</year>
          </date>
          <chapter-title>Proc. 62nd Ann. Int. Meeting. Soc. of Exploration Geophysics</chapter-title>
          <fpage>601</fpage>
          <lpage>604</lpage>
        </mixed-citation>
      </ref>
      <ref id="ridm1841857276">
        <label>17.</label>
        <mixed-citation xlink:type="simple" publication-type="journal">
          <name>
            <surname>Hatton</surname>
            <given-names>L</given-names>
          </name>
          <name>
            <surname>Roberts</surname>
            <given-names>A</given-names>
          </name>
          <article-title>How accurate is scientific software?</article-title>
          <date>
            <year>1994</year>
          </date>
          <source>IEEE Transactions on Software Engineering</source>
          <fpage>785</fpage>
          <lpage>797</lpage>
        </mixed-citation>
      </ref>
      <ref id="ridm1841829444">
        <label>18.</label>
        <mixed-citation xlink:type="simple" publication-type="book">
          <name>
            <surname>Shahram</surname>
            <given-names>M</given-names>
          </name>
          <name>
            <surname>Stodden</surname>
            <given-names>V</given-names>
          </name>
          <name>
            <surname>Donoho</surname>
            <given-names>D L</given-names>
          </name>
          <name>
            <surname>Maleki</surname>
            <given-names>A</given-names>
          </name>
          <name>
            <surname>Rahman</surname>
            <given-names>I</given-names>
          </name>
          <date>
            <year>2009</year>
          </date>
          <article-title>Reproducible Research in Computational Harmonic Analysis</article-title>
          <source>Computing in Science &amp; Engineering</source>
          <volume>11</volume>
          <issue>1</issue>
        </mixed-citation>
      </ref>
      <ref id="ridm1841825844">
        <label>19.</label>
        <mixed-citation xlink:type="simple" publication-type="journal">
          <name>
            <surname>Ince</surname>
            <given-names>D C</given-names>
          </name>
          <name>
            <surname>Hatton</surname>
            <given-names>L</given-names>
          </name>
          <name>
            <surname>Graham-Cumming</surname>
            <given-names>J</given-names>
          </name>
          <article-title>The case for open program code</article-title>
          <date>
            <year>2012</year>
          </date>
          <source>Nature</source>
          <volume>482</volume>
          <fpage>485</fpage>
          <lpage>488</lpage>
          <pub-id pub-id-type="doi">10.1038/nature10836</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ridm1841822820">
        <label>20.</label>
        <mixed-citation xlink:type="simple" publication-type="book">
          <name>
            <surname>Hatton</surname>
            <given-names>L</given-names>
          </name>
          <name>
            <surname>Warr</surname>
            <given-names>G</given-names>
          </name>
          <date>
            <year>2016</year>
          </date>
          <chapter-title>Full Computational Reproducibility in Biological Science: Methods, Software and a Case Study in Protein Biology</chapter-title>
          <source>arXiv</source>
Available from: http://arxiv.org/abs/1608.06897 [q-bio.QM]
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>
