Modeling bursts and heavy tails in human dynamics

Current models of human dynamics, used from risk assessment to communications, assume that human actions are randomly distributed in time and thus well approximated by Poisson processes. We provide direct evidence that for five human activity patterns the timing of individual human actions follow non-Poisson statistics, characterized by bursts of rapidly occurring events separated by long periods of inactivity. We show that the bursty nature of human behavior is a consequence of a decision based queuing process: when individuals execute tasks based on some perceived priority, the timing of the tasks will be heavy tailed, most tasks being rapidly executed, while a few experiencing very long waiting times. We discuss two queueing models that capture human activity. The first model assumes that there are no limitations on the number of tasks an individual can hadle at any time, predicting that the waiting time of the individual tasks follow a heavy tailed distribution with exponent alpha=3/2. The second model imposes limitations on the queue length, resulting in alpha=1. We provide empirical evidence supporting the relevance of these two models to human activity patterns. Finally, we discuss possible extension of the proposed queueing models and outline some future challenges in exploring the statistical mechanisms of human dynamics.


I. INTRODUCTION
Humans participate on a daily basis in a large number of distinct activities, from electronic communication, such as sending emails or browsing the web, to initiating financial transactions or engaging in entertainment and sports. Given the number of factors that determine the timing of each action, ranging from work and sleep patterns to resource availability, it appears impossible to seek regularities in the apparently random human activity patterns, apart from the obvious daily and seasonal periodicities. Therefore, in contrast with the accurate predictive tools common in physical sciences, forecasting human and social patterns remains a difficult and often elusive goal. Yet, the need to understand the timing of human actions is increasingly important. Indeed, uncovering the laws governing human dynamics in a quantitative manner is of major scientific interest, requiring us to address the factors that determine the timing of human actions. But these questions are driven by applications as well: most human actions have a strong impact on resource allocation, from phone line availability and bandwidth allocation in the case of Internet or Web use, all the way to the design of physical space for retail or service oriented institutions. Despite these fundamental and practical driving forces, our understanding of the timing of human initiated actions is rather limited at present.
To be sure, the interest in addressing the timing of events in human dynamics is not new: it has a long history in the mathematical literature, leading to the development of some of the key concepts in probability theory [1], and has reemerged at the beginning of the 20 th century as the design problems surrounding the phone system required a quantitative understanding of the call patterns of individuals. But most current models of human activity assume that human actions are performed at constant rate, meaning that a user has a fixed probability to engage in a specific action within a given time interval. These models approximate the timing of human actions with a Poisson process, in which the time interval between two consecutive actions by the same individual, called the waiting or inter-event time, follows an exponential distribution [2]. Poisson processes are at the heart of the celebrated Erlang formula [3], predicting the number of phone lines required in an institution, and they represent the basic approximation in the design of most currently used Internet protocols and routers [4]. Yet, the availability of large datasets recording selected human activity patterns increasingly question the validity of the Poisson approximation. Indeed, an increasing number of recent measurements indicate that the timing of many human actions systematically deviate from the Poisson prediction, the waiting or inter-event times being better approximated by a heavy tailed or Pareto distribution [5,6,7,8]. The difference between a Poisson and a heavy tailed behavior is striking: the exponential decay of a Poisson distribution forces the consecutive events to follow each other at relatively regular time intervals and forbids very long waiting times. In contrast, the slowly decaying heavy tailed processes allow for very long periods of inactivity that separate bursts of intensive activity.
We have recently proposed that the bursty nature of human dynamics is a consequence of a queuing process driven by hu-man decision making [5]: whenever an individual is presented with multiple tasks and chooses among them based on some perceived priority parameter, the waiting time of the various tasks will be Pareto distributed. In contrast, first-come-firstserve and random task execution, common in most service oriented or computer driven environments, lead to a uniform Poisson-like dynamics. Yet, this work has generated just as many questions as it resolved. What are the different classes of processes that are relevant for human dynamics? What determines the scaling exponents? Do we have discrete universality classes (and if so how many) as in critical phenomena [9], or the exponents characterizing the heavy tails can take up arbitrary values, as it is the case in network theory [10,11,12]? Is human dynamics always heavy tailed?
In this paper we aim to address some of these questions by studying the different universality classes that can appear as a result of the queuing of human activities. We first review, in Sect. II, the frequently used Poisson approximation, which predicts an exponential distribution of interevent times. In Sect. III we present evidence that the interevent time probability density function (pdf) P (τ ) of many human activities is characterized by the power law tail In Sect. IV we discuss the general characteristics of the queueing models that govern how humans time their various activities. In Sects. V-VI we study two classes of queuing models designed to capture human activity patterns. We find that restrictions on the queue length play an important role in determining the scaling of the queuing process, allowing us to document the existence of two distinct universality classes, one characterized by α = 3/2 (Sect. V) and the other by α = 1 (Sect. VI). In Sect. VII we discuss the relationship between interevent and waiting times. Finally, in Sec. VIII we discuss the applicability of these models to explain the empirical data, as well as outline future challenges in modeling human dynamics.

II. POISSON PROCESSES
Consider an activity performed with some regularity, such as sending emails, placing phone calls, visiting a library, or browsing the web. We can keep track of this activity by recording the timing of each event, for example the time each email is sent by an individual. The time between two consecutive events we call the interevent time for the monitored activity and will be denoted by τ . Given that the interevent time can be explicitly measured for selected activities, it serves as a test of our ability to understand and model human dynamics: proper models should be able to capture its statistical properties.
The most primitive model of human activity would assume that human actions are fundamentally periodic, with a period determined by the daily sleep patterns. Yet, while certain periodicity is certainly present, the timing of most human actions are highly stochastic. Indeed, periodic models are hopeless in capturing the time we check out a book from the library, beyond telling us that it should be within the library's operation hours. The first and still most widely used stochastic model of human activity assumes that the tasks are executed independently from each other at a constant rate λ, so that the time resolved activity of an individual is well approximated by a Poisson process [2]. In this case the probability density function (pdf) of the recorded interevent times has the exponential form In practice this means that the predicted activity pattern, while stochastic, will display some regularity in time, events following each other on average at τ ≈ τ = 1/λ intervals. Indeed, given that for a Poisson process σ = τ 2 − τ 2 = τ is finite, very long waiting times (i.e. large temporal gaps in the sequence of events) are exponentially rare. This is illustrated in Fig. 1a, where we show a sequence of events generated by a Poisson process, appearing uniformly distributed in time (but not periodic).
The Poisson process was originally introduced by Poisson in his major work applying probability concepts to the administration of justice [13]. Today it is widely used to quantify the consequences of human actions, such as modeling traffic flow patterns or accident frequencies [2], and is commercially used in call center staffing [14], inventory control [15], or to estimate the number of congestion caused blocked calls in mobile communications [4]. It has been established as a basic model of human activity patterns at a time when data collection capabilities on human behavior were rather limited. In the past few years, however, thanks to detailed computer based data collection methods, there is increasing evidence that the Poisson approximation fails to capture the timing of many human actions.

III. EMPIRICAL RESULTS
Evidence that non-Poisson activity patterns characterize human activity has first emerged in computer communications, where the timing of many human driven events is automatically recorded. For example, measurements capturing the distribution of the time differences between consecutive instant messages sent by individuals during online chats [16] have found evidence of heavy tailed statistics. Professional tasks, such as the timing of job submissions on a supercomputer [17], directory listings and file transfers (FTP requests) initiated by individual users [18] were also reported to display non-Poisson features. Similar patterns emerge in economic transactions [19,20], in the number of hourly trades in a given security [21] or the time interval distribution between individual trades in currency futures [22]. Finally, heavy tailed distributions characterize entertainment related events, such as the time intervals between consecutive online games played by users [23]. Note, however, that while these datasets provide clear evidence for non-Poisson human activity patterns, most of them do not resolve individual human behavior, but capture only the aggregated behavior of a large number of users. For example, the dataset recording the timing of the job submissions looks at the timing of all jobs submitted to a computer, by any user. Thus for these measurements the interevent time does not characterize a single user but rather a population of users. Given the extensive evidence that the activity distribution of the individuals in a population is heavy tailed, these measurements have difficulty capturing the origin of the observed heavy tailed patterns. For example, while most people send only a few emails per day, a few send a very large number on a daily basis [24,25].
If the activity pattern of a large number of users is simultaneously captured, it is not clear where the observed heavy tails come from: are they rooted in the activity of a single individual, or rather in the heavy tailed distribution of user activities? Therefore, when it comes to our quest to understand human dynamics, datasets that capture the long term activity pattern of a single individual are of particular value. To our The distribution of the exponents (α) characterizing the interevent time distribution of users browsing the web portal (e), individual loans from the library (f) and the emails sent by different individuals (g). The exponent α was determined only for users whose total activity levels exceeded certain thresholds, the values used being 15 web visits (e), 15 emails (f) and 10 books (g). (h,l) We numerically generate for 10,000 individuals interevent time distributions following a power-law with exponent α = 1. The distribution of the measured exponents follows a normal distribution similar to the distribution observed in (e-g). If we double the time window of the simulation (h) the deviation around the average becomes much smaller (l). (i-k) The distribution of the number of events in the studied systems: number of HTML hits for each user (i), the number of books checked out by each user (j) and the number of emails sent by different individuals (k), indicating that the overall activity patterns of individuals is also heavy tailed.
best knowledge only three papers have taken this approach, capturing the timing of printing jobs submitted by users [26], the email activity patterns of individual email users [5,24] and the browsing pattern of users visiting a major web portal [7]. These measurements offer direct evidence that the heavy tailed activity patterns emerge at the level of a single individual, and are not a consequence of the heterogeneous distribution of user activity. Despite this evidence, a number of questions remain unresolved: Is there a single scaling exponent characterizing all users, or rather each user has its own exponent? What is the range of these exponents? Next we aim to address these questions through the study of six datasets, each capturing individual human activity patterns of different nature. First we describe the datasets and the collection methods, followed by a quantitative characterization of the observed human activity patterns. Web browsing: Automatically assigned cookies allow us to reconstruct the browsing history of approximately 250,000 unique visitors of the largest Hungarian news and entertainment portal (origo.hu), which provides online news and magazines, community pages, software downloads, free email and search engine, capturing 40% of all internal Web traffic in Hungary [7,27]. The portal receives 6,500,000 HTML hits on a typical workday. We used the log files of the portal to collect the visitation pattern of each visitor between 11/08/02 and 12/08/02, recording with second resolution the timing of each download by each visitor [7]. The interevent time, τ , was defined as the time interval between consecutive page downloads (clicks) by the same visitor.
Email activity patterns: This dataset contains the email exchange between individuals in a university environment, capturing the sender, recipient and the time of each email sent during a three and six month period by 3,188 [24] and 9,665 [25] users, respectively. We focused here on the data collected by Eckmann [24], which records 129,135 emails with second resolution. The interevent time corresponds to the time between two consecutive emails sent by the same user.
Library loans: The data contains the time with second resolution at which books or periodicals were checked out from the library by the faculty at University of Notre Dame during a three year period. The number of unique individuals in the dataset is 2,247, together participating in a total of 48,409 transactions. The interevent time corresponds to the time difference between consecutive books or periodicals checked out by the same patron.
Trade transactions: A dataset recording all transactions (buy/sell) initiated by a stock broker at a Central European bank between 06/99 and 5/03 helps us quantify the professional activity of a single individual, giving a glimpse on the human activity patterns driving economic phenomena. In a typical day the first transactions start at 7AM and end at 7PM and the average number of transactions initiated by the dealer in one day is around 10, resulting in a total of 54,374 transactions. The interevent time represents the time between two consecutive transactions by the broker. The gap between the last transaction at the end of one day and the first transaction at the beginning of the next trading day was ignored.
The correspondence patterns of Einstein, Darwin and Freud: We start from a record containing the sender, recipient and the date of each letter [28,29,30]  were not dated or were assigned only potential time intervals spanning days or months. We discarded these letters from the dataset. Furthermore, the dataset is naturally incomplete, as not all letters written or received by these scientists were preserved. Yet, assuming that letters are lost at a uniform rate, they should not affect our main findings. For these three datasets we do not focus on the interevent times, but rather the response or waiting times τ w . The waiting time, τ w , represents the time interval between the date of a letter received from a given person, and the date of the next letter from Darwin, Einstein or Freud to him or her, i.e. the time the letter waited on their desk before a response is being sent. To analyze Einstein, Darwin, and Freud's response time we have followed the following procedure: if individual A sent a letter to Einstein on DATE1, we search for the next letter from Einstein to individual A, sent on DATE2, the response time representing the time difference τ = DATE2 − DATE1, expressed in days. If there are multiple letters from Einstein to the recipient, we always consider the first letter as the response, and discard the later ones. Missing letters could increase the response time, the magnitude of this effect depending on the overall frequency of communication between the respective correspondence partners. Yet, if the response time follows a distribution with an exponential tail, then randomly distributed missing letters would not generate a power law waiting time: they would only shift shift the exponential waiting times to longer average values. Thus the observed power law cannot be attributed to data incompleteness.
In the following we will break our discussion in three subsections, each focusing on a specific class of behavior observed in the studied individual activity patterns.
A. The α = 1 universality class: Web browsing, email, and library datasets In Fig. 2a-c we show the interevent time distribution between consecutive events for a single individual for the first four studied databases: Web browsing, email, and library visitation. For these datasets we find that the interevent time distribution has a power-law tail with exponent α ≈ 1, independent of the nature of the activity. Given that for these activity patterns we collected data for thousands of users, we need to calculate the distribution of the exponent α determined separatelly for each user whose activity level exceeds a certain threshold (i.e. avoiding users that have too few events to allow a meaningful determination of P (τ )). As Fig. 2e-g shows, we find that the distribution of the exponents is peaked around α = 1.
The scattering around α = 1 in the measured exponents could have two different origins. First, it is possible that each user is characterized by a different scaling exponent α. Second, each user could have the same exponent α = 1, but given the fact that the available dataset captures only a finite time interval from one month to several months, with at best a few thousand events in this interval, there are uncertainties in our ability to determine numerically the exponent α. To demonstrate that such data incompleteness could indeed explain the observed scattering, in Figs. 2h and 2l we show the result of a numerical experiment, in which we generated 10,000 time series, corresponding to 10,000 independent users, the interevent time of the events for each user being taken from the same distribution P (τ ) ∼ τ −1 . The total length in time of each time series was chosen to be 1, 000, 000. We then used the automatic fitting algorithm employed earlier to measure the exponents in Figs. 2e-g to determine numerically the exponent α for each user. In principle for each user we should observe the same exponent α = 1, given that the datasets were generated in an identical fashion. In practice, however, due to the finite length of the data, each numerically determined exponent is slightly different, resulting in the histogram shown in Fig. 2h. As the figure shows, even in this well controlled situation we observe a scattering in the measured exponents, obtaining a distribution similar to the one seen in Figs. 2eg. The longer the time series, the sharper the distribution is ( Fig. 2l), given that the exponent α can be determined more accurately.
The distributions obtained for the three studied datasets are not as well controlled as the one used in our simulation: while the length of the observation period is the same for each user, the activity level of the users differs widely. Indeed, as we show in Fig. 2i-k, the activity distribution of the different users, representing the number of events recorded for each user, also spans several orders of magnitude, following a fat tailed distribution. Thus the degree of scattering of the measured exponent α is expected to be more significant than seen in Fig 2h and l, since we can determine the exponent accurately only for very active users, for which we have a significant number of datapoints. Therefore, the obtained results are consistent with the hypothesis that each user is characterized by a scaling exponent in the vicinity of α = 1, the difference in the numerically measured exponent values being likely rooted in the finite number of events we record for each user in the datasets. This conclusion will be eventually corroborated by our modeling efforts, that indicate that the exponents characterizing human behavior take up discrete values, one of which, provide the empirically observed α = 1.
As we will see in the following sections, an important measure of the human activity patterns is the waiting time, τ w , representing the amount of time a task waits on an individual's priority list before being executed. For the email dataset, given that we know when a user receives an email from another user and the time it sends the next email back to her, we can determine the email's waiting or response time. Therefore, we define the waiting time as the difference between the time user A receives an email from user B, and the time A sends an email to user B. In looking at this quantity we should  be aware of the fact that not all emails A sends to B are direct responses to emails received from B, thus there are some false positives in the data that could be filtered out only by reading the text of each email (which is not possible in the available datasets). We have measured the empirically obtained waiting time distribution in the email dataset, finding that the distribution of the response times indeed follows a power law with exponent α = 1 (Fig. 3a).

B. The α = 3/2 universality class: The correspondence of Einstein, Darwin and Freud
In the case of the correspondence patterns of Einstein and Darwin we will focus on the response time of the authors, partly because we will see later that this has the most importance from the modeling perspective. As shown in Fig. 4, the probability that a letter will be replied to in τ w days is well approximated by a power law (3) with α = 3/2, the scaling spanning four orders of magnitude, from days to years. Note that this exponent is significantly different from α = 1 observed in the earlier datasets, and we will show later that modeling efforts indeed establish α = 3/2 as a scaling exponent characterizing human dynamics.
The dataset allows us to determine the interevent times as well, representing the time interval between two consecutive letters sent by Einstein, Darwin or Freud to any recipient. We find that the interevent time distribution is also heavy tailed, albeit the quality of scaling is not as impressive as we observe for the response time distribution. This is due to the fact that we do not know the precise time when the letter is written (in contrast with the email, which is known with second resolution), but only the day on which it was mailed. Given that both Einstein and Darwin wrote at least one letter most days, this means that long interevent times are rarely observed. Furthermore, owing to the long observational period (over 70 years), the overall activity pattern of the two scientists has changed significantly, going from a few letters per year to as many 400-800 letters/year during the later, more famous phase of their professional life. Thus the interevent time, while it appears to follow a power law distribution, it is by no means stationary. More stationarity is observed, however, in the response time distribution.

C. The stock broker activity pattern
For the stock broker we again focus on the interevent time distribution, finding that the best fit follows P (τ ) ∼ τ −α exp(−τ /τ 0 ) with α = 1.3 and τ 0 = 76 min (see Fig.  1d). This value is between α = 1 observed for the users in the first three other datasets and α = 3/2 observed for the correspondence patterns. Yet, given the scattering of the measured exponents, it is difficult to determine if this represents a standard statistical deviation from α = 1 or α = 3/2, the two values expected by the modeling efforts (see Sects. V and VI), or it stands as evidence for a new universality class. At this point we believe that the former case is valid, something that can be decided only once data for more users will become available. The exponential cutoff is not inconsistent with the modelling efforts either: as we will show in Appendix C, such cutoffs are expected to accompany all human activity patterns with α < 2.

D. Qualitative differences between heavy tailed and Poisson activity patterns
The heavy tailed nature of the observed interevent time distribution has clear visual signatures. Indeed, it implies that an individual's activity pattern has a bursty character: short time intervals with intensive activity (bursts) are separated by long periods of no activity (Figs. 1d-f). Therefore, in contrast with the relatively uniform activity pattern predicted by the Poisson process, for a heavy tailed process very dense successions of events (bursts) are separated by very long gaps, predicted by the slowly decaying tail of the power law distribution. This bursty activity pattern agrees with our experience of an individual's normal email usage pattern: during a single session we typically send several emails in quick succession, followed by long periods of no email activity, when we focus on other activities.

IV. CAPTURING HUMAN DYNAMICS: QUEUING MODELS
The empirical evidence discussed in the previous section raises several important questions: Why does the Poisson process fail to capture the temporal features of human activity? What is the origin of the observed heavy tailed activity patterns in human dynamics? To address these questions we need to inspect closely the processes that contribute to the timing of the events in which an individual participates.
Most of the time humans face simultaneously several work, entertainment, and family related responsibilities. Indeed, at any moment an individual could choose to participate in one of several tasks, ranging from shopping to sending emails, making phone calls, attending meetings or talks, going to a theater, getting tickets for a sports event, and so on. To keep track of the various responsibilities ahead of them, individuals maintain a to do or priority list, recording the upcoming tasks. While this list is occasionally written or electronically recorded, in many cases it is simply kept in memory. A priority list is a dynamic entity, since tasks are removed from it after they are executed and new tasks are added continuously. The tasks on the list compete with each other for the individual's time and attention. Therefore, task management by humans is best described as a queuing process [31,32], where the queue represents the tasks on the priority list, the server is the individual which executes them and maintains the list, and some selection protocol governs the order in which the tasks are executed. To define the relevant queuing model we must clarify some key features of the underlying queuing process, ranging from the arrival and service processes to the nature of the task selection protocol, and the restrictions on the queue length [31]. In the following we discuss each of these ingredients separately, placing special emphasis on their relevance to human dynamics.
Server: The server refers to the individual (or agent) that maintains the queue and executes the tasks. In queuing theory we can have one or several servers in parallel (like checkout counters in a supermarket). Human dynamics is a single server process, capturing the fact that an individual is solely responsible for executing the tasks on his/her priority list.
Task Arrival Pattern: The arrival process specifies the statistics of the arrival of new tasks to the queue. In queuing theory it is often assumed that the arrival is a Poisson process, meaning that new tasks arrive at a constant rate λ to the queue, randomly and independently from each other. We will use this approximation for human queues as well, assuming that tasks land at random times on the priority list. If the arrival process is not captured by a Poisson distribution, it can be modeled as a renewal process with a general distribution of interarrival times [31]. For example, our measurements indicate that the arrival time of emails follows a heavy tailed distribution, thus a detailed modeling of email based queues must take this into account. We must also keep in mind that the arrival rate of the tasks to the list is filtered by the individual, who decides which tasks to accept and place on the priority list and which to reject. In principle the rejection of a task is also a decision process that can be modeled as a high priority short lived task.
Service process: The service process specifies the time it takes for a single task to be executed, such as the time necessary to write an email, explore a web page or read a book. In queuing theory the service process is often modeled as a Poisson process, which means that the distribution of the time devoted to the individual tasks has the exponential form (2). However, in some applications the service time may follow some general distribution. For example, the size distribution of files transmitted by email is known to be fat tailed [33,34], suggesting that the time necessary to review (read) them could also follow a fat tailed distribution. In queuing theory it is often assumed that the service time is independent of the task arrival process or the number of tasks on the priority list. While we adopt this assumption here as well, we must also keep in mind that the service time can decrease if too many tasks are in the queue, as humans may devote less time to individual tasks when they have many urgent things to do.
Selection protocol or queue discipline: The selection protocol specifies the manner in which the tasks in the queue are selected for execution. Most human initiated events require an individual to weight and prioritize different activities. For example, at the end of each activity an individual needs to decide what to do next: send an email, do some shopping or place a phone call, allocating time and resources for the chosen activity. Normally individuals assign to each task a priority parameter, which allows them to compare the urgency of the different tasks on the list. The time a task waits before it is executed depends on the method the agent uses to choose the task to be executed next. In this respect three selection protocols are particularly relevant for human dynamics: (i) The simplest is the first-in-first-out (FIFO) protocol, executing the tasks in the order they were added to the list. This is common in service oriented processes, like the first-comefirst-serve execution of orders in a restaurant or getting help from directory assistance and consumer support.
(ii) The second possibility is to execute the tasks in a random order, irrespective of their priority or time spent on the list. This is common, for example, in educational settings, when students are called on randomly, and in some packet routing protocols.
(iii) In most human initiated activities task selection is not random, but the individual tends to execute always the highest priority item on his/her list. The resulting execution dynamics is quite different from (i) and (ii): high priority tasks will be executed soon after their addition to the list, while low priority items will have to wait until all higher priority tasks are cleared, forcing them to stay longer on the list. In the following we show that this selection mechanism, practiced by humans on a daily basis, is the likely source of the fat tails observed in human initiated processes.
Queue Length or System Capacity: In most queuing models the queue has an infinite capacity and the queue length can change dynamically, depending on the arrival and the execution rate of the individual tasks. In some queuing processes there is a physical limitation on the queue length. For example, the buffers of Internet routers have finite capacity, so that packets arriving while the buffer is full are systematically dropped. In human activity one could argue that, given the possibility to maintain the priority list in a written or electronic form, the length of the list has no limitations. Yet, if confronted with too many responsibilities, humans will start dropping some tasks and not accept others. Furthermore, while keeping track of a long priority list is not a problem for an electronic organizer, it is well established that the immediate memory of humans has finite capacity of about seven tasks [35,36]. In other words, the number of priorities we can easily remember, and therefore the length of our priority list, is bounded. These considerations force us to inspect closely the difference between finite and an unbounded priority lists, and the potential consequences of the queue length on the the waiting time distribution.
In this paper we follow the hypothesis that the empirically observed heavy tailed distributions originate in the queuing process of the tasks maintained by humans, and seek appropriate models to explain and quantify this phenomenon. Particularly valuable are queuing models that do not contain power law distributions as inputs, and yet generate a heavy tailed output. In the following we will focus on priority queues, reflecting the fact that humans most likely choose the tasks based on their priority for execution.
In the empirical datasets discussed in Sect III we focused on both the interevent time and the waiting time distribution of the tasks in which humans participate. In the following two sections we focus on the waiting time of a task on the priority list rather than the interevent times. In this context he waiting time, τ w , represents the time difference between the arrival of a task to the priority list and its execution, thus it is the sum of the time a task waits on the list and the time devoted to executing it. In Sect. VII we will return to the relationship between the empirically observed interevent times and the waiting times predicted by the discussed models.

V. MODELS WITH VARIABLE QUEUE LENGTH: α = 3/2 UNIVERSALITY CLASS
Our first goal is to explore the behavior of priority queues in which there are no restrictions on the queue length. Therefore, in these models an individual's priority list could contain arbitrary number of tasks. As we will show below, such models offer a good approximation to the surface mail correspondence patterns, such as that observed in the case of Einstein, Darwin and Freud (see Sect. III B). Therefore, we will construct the models with direct reference to the the datasets discussed in Sect. III. We assume that letters arrive at rate λ following a Poisson process with exponential arrival time distribution. Replacing letters with tasks, however, provides us a more general model, in principle applicable to any human activity. The responses are written at rate µ, reflecting the overall time a person devotes to his correspondence. Each letter is assigned a discrete priority parameter x = 1, 2, . . . , r upon arrival, such that always the highest priority unanswered letter (task) will be always chosen for a reply. The lowest priority task will have to wait the longest before execution, and therefore it dominates the waiting time probability density for large waiting times. This model was introduced in 1964 by Cobham [37] to describe some manufacturing processes. Most of the analytical work in queuing theory has concentrated on the waiting time of the lowest priority task, finding that the waiting time distribution follows [38] where A and τ 0 are functions of the model parameters, the characteristic waiting time τ 0 being given by where ρ = λ/µ is the traffic intensity. Therefore, the waiting time distribution is characterized by a power law decay with exponent α = 3/2, combined with an exponential cutoff. The model can be extended to the case where the priorities are not discrete, but take up continuous values 0 ≤ x < ∞ from an arbitrary η(x) distribution. The Laplace transform of the waiting time distribution for this case has been calculated in Ref. [31], but the resulting equation is difficult to invert, forcing us to study the model numerically (Fig. 5). The natural control parameter is ρ = λ/µ, allowing us to distinguish three qualitatively different regimes: Subcritical regime, ρ < 1: Given that the arrival rate of the tasks is smaller than the execution rate, the queue will be often empty. This significantly limits the waiting time, most tasks being executed soon after their arrival. The simulations indicate that the waiting time distribution exhibits an asymptotic scaling behavior consistent with (4) (Fig. 5). While in the ρ → 0 limit we observe mainly the exponential decay, as ρ approaches 1 a power law regime with exponent α = 3/2 emerges, combined with the exponential cutoff. V with continuous priorities. The numerical simulations were performed as follows: At each step we generate an arrival τa and service time τs from an exponential distribution with rate λ and µ, respectively. If τa < τs or there are no tasks in the queue then we add a new task to the queue, with a priority x ∈ [0, 1] from uniform distribution, and update the time t → t + τa. Otherwise, we remove from the queue the task with the largest priority and update the time t → t + τs. The waiting time distribution is plotted for three ρ = λ/µ values: ρ = 0.9 (circles), ρ = 0.99 (squares) and ρ = 0.999 (diamonds). The data has been rescaled to emphasize the scaling behavior P (τw) = τ In the inset we plot the distribution of waiting times for ρ = 1.1, after collecting up to 10 4 (plus) and 10 5 (diamonds) executed tasks, showing that the distribution of waiting times has a power law tail even for ρ > 1 (supercritical regime). Note, however, that in this regime a high fraction of tasks are never executed, staying forever on the priority list whose length increases linearly with time, a fact that is manifested by a shift to the right of the cutoff of the waiting time distribution.
Critical regime, ρ = 1 : When the arrival and the response rate of the letters are equal, according to (4) and (5) we should observe a power law waiting time distribution with α = 3/2 (Fig. 5). This regime would imply that, for example, Darwin responds to all letters he receive, which is not the case, given that their response rate is 0.32 (Darwin), 0.24 (Einstein) and 0.31 (Freud) [6]. In this case it is easy to show that the queue length performs a one-dimensional random walk bounded at l = 0. These fluctuations in the queue length will limit the waiting time distribution, as the tasks will wait at most as long as it takes for the queue length to return to l = 0. Therefore, the waiting time distribution will have as upper bound the return time distribution of a one-dimensional random walk. It is known, however, that the return time distribution of a random walker follows P (t) ∼ t −3/2 [39,40], which is the origin of the 3/2 exponent in Eq. (3). This argument indicates that (4) is related to the fluctuations in the length of the priority list.
Supercritical regime, ρ > 1 : Given that in this regime the arrival rate exceeds the response rate, the average queue length grows linearly as l(t) = (λ − µ)t. Therefore, a 1 − 1/ρ fraction of the letters is never responded to, waiting indefi-nitely in the queue. Given Darwin, Einstein and Freud's small response rate, this regime captures best their correspondence pattern. We can measure the waiting time for each letter that is responded to. In Fig. 5 we show the waiting time probability density obtained from numerical simulations, indicating that it follows a power law with exponent α = 3/2. Thus the supercritical regime follows the same scaling behavior as the critical regime, but only for the letters that are responded to. The rest of the letters wait indefinitely in the list (τ w = ∞).
While the discussed model can indeed generate power law waiting time distributions, a critical comparison with the empirical datasets reveals some notable deficiencies. First, a power law distribution emerges only in the critical (ρ = 1) and the supercritical (ρ > 1) regimes. The critical regime requires a careful tuning of the human execution rate, so that the execution and the arrival rates are exactly the same. In contrast, for ρ > 1 no tuning is necessary, but the number of tasks on the list increases linearly with time, thus many tasks are never executed. This limit is probably the most realistic for human dynamics: we often take on tasks that we never execute, and technically stay on our priority list forever. As we discussed above, this is documentedly the case for Einstein, Darwin and Freud, who answer only a fraction of their letters. However, we must not overlook the second important feature of the discussed model: the only exponent it can predict is α = 3/2, rooted in the fluctuations of the queue length. While this fully agrees with the correspondence patterns of Einstein, Darwin and Freud, it is significantly higher than the values observed in the empirical data discussed in Sect. III A on web browsing, email communications or library visits, which we found to be scattered around α = 1.

VI. MODELS WITH FIXED QUEUE LENGTH: α = 1 UNIVERSALITY CLASS
To understand the limitations of the model discussed in the previous section we must remember that when the arrival and execution rates are equal (ρ = 1) the length of the priority list follows a random walk, and can thus occasionally take up very large values. The situation is even worse for ρ > 1, when the queue length increases linearly with time. Therefore, according to the model an individual must have the capacity to keep track of hundreds or thousands of tasks at the same time. This may be appropriate for surface mail, where the letters pile on our desk until replied to. In contrast, there is extensive evidence from the psychology literature that the number of tasks humans can easily keep in their short term memory is bounded [35], therefore it is unrealistic that we will remember hundreds or thousands of tasks at any given time. This forces us to inspect a model in which the length of the priority list remains unchanged [5], a new task being added only when an old task is removed from the list (executed).
We assume that an individual mantains a priority list with L tasks, each task being assigned a priority parameter x i , i = 1, ..., L, chosen from an η(x) distribution. At each time step with probability p the individual selects the highest priority task and executes it, removing it from the list. At that moment a new task is added to the list, its priority x i is again chosen from η(x), thus the length L of the list remains unchanged. With probability 1 − p the individual executes a randomly selected task, independent of its priority. The p → 1 limit of the model describes the deterministic highest-priority-first protocol, when always the highest priority task is chosen for execution, while p → 0 corresponds to the random choice protocol, introduced to mimic the fact that humans occasionally select some low priority items for execution, before all higher priority items are executed. In the model time is discrete, each task execution corresponding to one unit of time. Implicit in this assumption is the approximation that the service time distribution follows a delta function, i.e., each task takes one unit time to execute.
To understand the dynamics of the model we first study it via numerical simulations with priorities chosen from a uniform distribution x i ∈ [0, 1]. The simulations show that in the p → 1 limit the probability that a task spends τ w time on the list has a power law tail with exponent α = 1 (Fig. 6a). In the p → 0 limit P (τ w ) follows an exponential distribution (Fig. 6a), as expected for the random selection protocol. As the typical length of the priority list differs from individual to individual, it is important for the tail of P (τ w ) to be independent of L. Numerical simulations indicate that this is indeed the case: changes in L do not affect the scaling of P (τ w ) [5]. The fact that the scaling holds for L = 2 as well indicates that it is not necessary to have a long priority list: even if an individual balances only two tasks at the same time, a bursty heavy tailed interevent dynamics will emerge. Next we focus on the L = 2 case, for which the model can be solved exactly, providing important insights into its scaling behavior that can be generalized for arbitrary L values as well.

A. Exact solution for L = 2
For L = 2 the waiting time distribution can be determined exactly [8] (see Appendix B), obtaining independent of η(x) from which the task priorities are selected. In the limit p → 0 from (6) follows that i.e. P (τ w ) decays exponentially, in agreement with the numerical results (Fig. 6a). This limit corresponds to the random selection protocol, where a task is selected with probability 1/2 in each step. In the p → 1 limit we obtain In this case almost all tasks have a waiting time τ w = 1, being executed as soon as they were added to the priority list. The waiting time of tasks that are not selected in the first step follows a power law distribution, decaying with α = 1. This behavior is illustrated in Fig. 6a by a direct plot of P (τ w ) in (6) for a uniform distribution η(x) in 0 ≤ x ≤ 1. For p < 1 the P (τ w ) distribution has an exponential cutoff, which can be derived from (6) after taking the τ w → ∞ limit with p fixed, resulting in where τ 0 = ln When p → 1 we obtain that τ 0 → ∞ and, therefore, the exponential cutoff is shifted to higher τ w values, while the power law behavior P (τ w ) ∼ 1/τ w becomes more prominent. The P (τ w ) curve systematically shifts, however, to lower values for τ w > 1, indicating that the power law applies to a vanishing task fraction (see Fig. 6a and (9)). In turn, P (1) → 1 when p → 1, corroborated by the direct plot of P (1) as a function of p (see inset of Fig. 6a).

B. Numerical results for L > 2
Based on the results discussed above, the overall behavior of the model with a uniform priority distribution can be summarized as follows. For p = 1, corresponding to the case when always the highest priority task is removed, the model does not have a stationary state. Indeed, each time the highest priority task is executed, there is a task with smaller priority x m left on the list. With probability 1 − x m the newly added task will have a priority x ′ m larger than x m , and will be executed immediately. With probability x m , however, the new task will have a smaller priority, in which case the older task will be executed, and the new task will become the 'resident' one, with a smaller priority x ′ m < x m . For a long period all new tasks will be executed right away, until an another task arrives with probability x ′′ m that again pushes the nonexecuted priority to a smaller value x ′′ m < x ′ m . Thus with time the priority of the lowest priority task will converge to zero, x m (t) → 0, and thus with a probability converging to one the new task will be immediately executed. This convergence of x m to zero implies that for p = 1 the model does not have a stationary state. A stationary state develops, however, for any p < 1, as in this case there is always a finite chance that the lowest priority tasks will also be executed, thus the value of x m will be reset, and will converge to some x m (p) > 0. This qualitative description applies for arbitrary L > 2 values.
To quantify this qualitative picture we studied numerically the L > 2 case assuming that η(x) is uniformly distributed in the 0 ≤ x ≤ 1 interval. To investigate how fast the system approaches the stationary state we compute the average priority of the lowest priority task in the queue, x min (t) (see Fig. 7a,b) since it represents a lower bound for the average of any other priorities on the list. We find that for any L values x min (t) decreases exponentially up to a time scale t 0 , when it reaches a stationary value x min (∞) . The numerical simulations indicate that For L = 2 can calculate x min (∞) exactly [8], obtaining and therefore θ 2 = 1. For L > 2 we determined θ L from the best data collapse, obtaining the values shown in the inset of Fig. 7b, indicating that where θ 3 = 0.22 is the value of θ L for L = 3. These results support our qualitative discussion, indicating that for all L ≥ 2 and 0 ≤ p < 1 values the system reaches a stationary state.
Finally we measured the waiting time distribution after the system has reached the stationary state. The results for L = 3 are shown in Fig. 7c, and similar results were obtained for other L > 2 values. The data collapse of the numerically obtained P (τ ) indicates that when L > 2 and τ ≫ 1, where in the p → 1 limit. The simulations indicate that the model's behavior for L > 2 is qualitatively similar to the behavior derived exactly for L = 2, but different scaling parameters characterize the scaling functions. For any L ≥ 2, however, the waiting times scale as P (τ w ) ∼ τ −1 w , i.e. we have α = 1.

C. Comparison with the empirical data
As the results in the previous subsections show, the model proposed to account for the α = 1 universality class has some apparent problems. Indeed, for truly deterministic execution (p = 1) the model does not have a stationary state. The problem was cured by introducing a random task execution (p < 1), which leads to stationarity. In this case, however, a p dependent fraction of tasks are executed immediately, and only the rest of the long lived tasks follow a power law. As p converges to zero, the fraction of tasks executed immediately diverges, developing a significant gap between the power law regime, and the tasks displaying τ = 1 waiting time. Is this behavior realistic, or represents an artifact of the model? A first comparison with the empirical data would suggest that this is indeed an artifact, as measurements shown in Fig. 2 and  3 do not provide evidence of a large number of tasks that are immediately executed. However, when inspecting the measurement results we should keep in mind that they represents the intervent times, and not the waiting times. In the case when the waiting time can be directly measured, like in the email or mail based correspondence, there is some ambiguity to the real waiting time. Indeed, in the email data, for example, we have measured as waiting time the time difference between the arrival of an email, and the response sent to it. While this offers an excellent approximation, from an individual's or a priority queue's perspective this is not the real waiting time. Indeed, consider the situation when an email arrives at 9:00 am, and the recipient does not check her email until 11:56am, at which point she replies to the email immediately. From the perspective of her priority list the waiting time was less than a minute, as she replied as soon as she saw the email. In our dataset, however, the waiting time will be 3 hours and 56 minutes. Thus the way we measured the waiting times cannot identify the true waiting time of a task on a user's priority list. The email dataset allows us, however, to get a much better approximation of the real waiting times than we did before. Indeed, for an email e 1 received by user A we record the time t 1 it arrives, and then the time t 2 of the first email sent by user A to any other user after the arrival of the selected email. It will be this time from which we start measuring the waiting time for email e 1 . Thus if user A replies to e 1 at time t 3 , we consider that the email's waiting time τ real = t 3 − t 2 , instead of t 3 − t 1 considered in Fig  3a. The results, shown in Fig 3c, displays the same power law scaling with α = 1 as we have seen in Fig 3a, but in addition there is a prominent peak at τ real = 1, cooresponding to emails responded to immediately. Note that the peak's magnitude is orders of magnitude larger than the probabilities displayed by the large waiting times. This result suggests that what we could have easily considered a model artifact in fact captures a common feature of email communications. Indeed, a high fraction of our emails is responded immediately, right after our first chance to read them, as predicted by the priority model discussed in this section. Are there models that can provide the α = 1 universality class without the high fraction of items executed imediatelly? While we have failed to come up with any examples, we belive that developing such models could be quite valuable.

VII. RELATIONSHIP BETWEEN WAITING TIMES AND INTEREVENT TIMES
As we discussed above, the empirical measurements provide either the interevent time distribution P (τ ) (Sects. III A and III C) or the waiting time distribution P (τ w ) (Sect. III B) of the measured human activity patterns. In contrast the model predicts only the waiting time τ w of a task on an individual's priority list. What is the relationship between the observed interevent times and the predicted waiting times? The basic thesis of our paper is that the waiting times the various tasks experience on an individual's priority list is responsible for the heavy tailed distributions seen in the interevent times as well. The purpose of this section is to discuss the relationship between the two quantities.
The model predictions, that the waiting time distribution of the tasks follows a power law, is directly supported by one dataset in each universality class: the email data and the correspondence data. As discussed in Sect. III, we have measured the waiting time distribution for both datasets, finding that the distribution of the response times indeed follows a power law with exponent α = 1 (email) and α = 3/2 (correspondence mail) as predicted by the models. Therefore, the direct measurement of the waiting times are likely rooted in the fat tailed response time distribution. For the other three datasets, however, such as web browsing, library visits and stock purchases, we cannot determine the waiting time of the individual events, as we do not know when a given task is added to the individual's priority list.
To explore the broader relationship between the waiting times and the interevent times we must remind ourselves that while during the measurements we are focusing on a specific task (like email), the models assume the knowledge of all tasks that an individual is involved in. Thus the empirical measurements offer only a selected subset of an individual's activity pattern. To see the relationship between τ and τ w next we discuss two different approaches.
Queueing of different task categories: The first approach acknowledges the fact that tasks are grouped in different categories of priorities: we often do not keep in mind specific emails to be answered, but rather remember that we need to check our email and answer whatever needs attention. Similarly, we may remember a few things that we need to shop for, but our priority list would often contain only one item: go to the supermarket. When we monitor different human activity patterns, we see the repetitive execution of these categories, like going to the library, or doing emails, or browsing the web. Given this, one possible modification of the discussed models would assume that the tasks we monitor correspond to specific activity categories, and when we are done with one of them, we do not remove it from the list, but we just add it back with some changed priority. That is, checking our email does not mean that we deleted email activity from our priority list, but only that next has some different priority. If we monitor only one kind of activity, then a proper model would be the following: we have L tasks, each assigned a given priority. After a task is executed, it will be reinserted in the queue with a new priority chosen from the same distribution η(x). If we now monitor the time at which the different tasks exit the list, we will find that the interevent times for the monitored tasks correspond exactly to the waiting time of that task on the list. Note that this conceptual model would work even if the tasks are not immediately reinserted, but after some delay τ d . Indeed in this case the interevent time will be τ = τ w + τ d , and as long as the distribution from which τ d is selected from is bounded, the tail of the interevent time distribution will be dominated by the waiting time statistics.
Interaction between individuals: The timing of specific emails also depends on the interaction between the individuals that are involved in an email based communication. Indeed, if user A gets an email from user B, she will put the email into her priority list, and answer when she gets to it. Thus the timing of the response depends on two parameters: the receipt time of the email, and the waiting time on the priority list. Consider two email users, A and B, that are involved in an email based conversation. We assume that A sends an email to B as a response to an email B sent to A, and viceversa. Thus, the interevent time between two consecutive emails sent by user A to user B is given by τ = τ A w + τ B w , where τ A w is the waiting time the email experienced on user A's queue, and τ B w is the waiting time of the response of user B to A's email. If both users prioritize their tasks, then they both display the same waiting time distribution, i.e. P (τ A w ) ∼ (τ A w ) −α and P (τ B w ) ∼ (τ B w ) −α . In this case the interevent time distribution P (τ ), which is observed empirically if we study only the activity pattern of user A, follows also P (τ ) ∼ τ −α . Thus the fact that users communicate with each other turns the waiting time into an observable interevent times.
In summary, the discussed mechanisms indicate that the waiting time distribution of the tasks could in fact drive the interevent time distribution, and that the waiting time and the interevent time distributions should decay with the same scaling exponent. In reality, of course, the interplay between the two quantities can be more complex than discussed here, and perhaps even better mapping between the two measures could be found for selected activities. But these two mechanisms indicate that if the waiting time distribution is heavy tailed, we would expect that the interevent time distribution would be also affected by it.

VIII. DISCUSSION
Universality classes: As summarized in the introduction, the main goal of the present paper was to discuss the potential origin of the heavy tailed distributed interevent times observed in human dynamics. To start we provided evidence that in five distinct processes, each capturing a different human activity, the interevent time distribution for individual users follow a power law. Our fundamental hypothesis is that the observed interevent time distributions are rooted in the mechanisms that humans use to decide when to execute the tasks on their priority list. To support this hypothesis we studied a family of queuing models, that assume that each task to be executed by an individual waits some time on the individual's priority list and we showed that queuing can indeed generate power law waiting time distributions. We find that a model that allows the queue length to fluctuate leads to α = 3/2, while a model for which the queue length is fixed displays α = 1. These results indicate that human dynamics is described by at least two universality classes, characterized by empirically distinguishable exponents. Note that while we have classifed the models based on the limitations on the queue lenght, we cannot exclude the existence of models with fixed queue lenght that scale with α = 3/2, or models with fluctuating lenght that display scaling with α = 1, or some other exponents.
In comparing these results with the empirical data, we find that email and phone communication, web surfing and library visitation belong to the α = 1 universality class. The correspondence patterns of Einstein, Darwin and Freud offer convincing evidence for the relevance of the α = 3/2 exponent, and the related universality class, for human dynamics. In contrast the fourth process, capturing a stock broker's activity, shows α = 1.3. Given, however, that we have data only for a single user, this value is in principle consistent with the scattering of the exponents from user to user, thus we cannot take it as evidence for a new universality class. One issue still remains without a satisfactory answer: why does email and surface mail (Einstein, Darwin and Freud datasets) belong to different universality classes? We can comprehend why should the mail correspondence belong to the 3/2 class: letters likely pile on the correspondent's desk until they are answered, the desk serving as an external memory, thus we do not require to remember them all. But the same argument could be used to explain the scaling of email communications as well, given that unanswered emails will stay in our mailbox until we delete them (which is one kind of task execution). Therefore one could argue that email based communication should also belong to the 3/2 universality class, in contrast with the empirical evidence, that clearly shows α = 1 [5,24].
Some difficulty in comparing the empirical data with the model predictions is rooted in the fact that the models predict the waiting times, while for many real systems only the interevent times can be measured. It is encouraging, however, that for the email and the surface mail based commnunication we were able to determine directly the waiting times as well, and the exponents agreed with those determined from the interevent times. In addition we argued that in a series of processes the waiting time distribution determines the interevent time distribution as well (see Sect. VII). This argument closes the loop of the paper's logic, establishing the relevance of the discussed queueing models to the datasets for which only interevent times could be measured. We do not feel, however, that this argument is complete, and probably future work will strengthen this link. In this respect two directions are particularly promising. First, designing queueing models that can directly predict the observed interevent times as well would be a major advance. Second, establishing a more general link between the waiting time and interevent times could also be of significant value. The results discussed in this paper leave a number of issues unresolved. In the following we will discuss some of these, outlining how answering them could further our understanding of the statistical mechanics of human driven processes.
Tuning the universality class: As we discussed above, the discussed models provide evidence for two distinct universality classes in human dynamics, with distinguishable exponents. The question is, are there other universality classes, characterized by exponents different from 1 and 3/2? If other universality classes do exist, it would be valuable not only to find empirical support for them, but also to identify classes of models that are capable of predicting the new exponents.
In searching for new exponents we need to explore several different directions. First, if one inserts some power law process into the queuing model, that could tune the obtained waiting time distribution, and the scaling exponents. There are different ways to achieve this. One method, discussed in Appendix A, is based on the hypothesis that while we always attempt to select the highest priority task, circumstances or resource availability may not allow us to achieve this. For example, our highest priority may be to get cash from the bank, but we cannot execute this task when the bank is closed, moving on to some lower priority task. One way to account for this is to use a probabilistic selection protocol, assuming that the probability to choose a task with priority x for execution in a unit time is Π(x) ∼ x γ , where γ is a parameter that interpolates between the random choice limit (ii) (γ = 0, p = 0) and the deterministic case, when always the highest priority item is chosen for execution (iii) (γ = ∞, p = 1). As shown in the Appendix A, the exponent α will depend on γ as At this moment we do not have evidence that such preferential selection process acts in human dynamics. However, detailed datasets and proper measurement tools might help up decide this by measuring the function Π(x) directly, capturing the selection protocol. Such measurements were possible for com-plex networks, where a similar function drives the preferential attachment process [10,41,42,43,44,45]. As we discussed above, the main goal of this paper was to demonstrate that the queuing of the tasks on an individual's priority list can explain the heavy tailed distributions observed in human activity patterns. To achieve this, we focused on models with Poisson inputs, meaning that both the arrival time and the execution time are bounded. In some situations, however, the input distributions can be themselves heavy tailed. This could have two origins: (i) Heavy tailed arrival time distribution: As we show in Fig. 3b, there is direct evidence for this in the email communication datasets: we find that the interevent time of arriving emails can be roughly approximated with a power law with exponent α in = 1. (ii) The execution time could also be heavy tailed, describing the situation when most tasks are executed very rapidly, while a few tasks require a very long execution time. Evidence for this again comes from the email system: the file sizes transmitted by email are known to follow a heavy tailed distribution [33,34]. Therefore, if we read every line of an email, in principle the execution time should also be heavy tailed (i.e. the time we actually take to work on the response, including reading the original email). Note, however, that measurements failed to establish a correlation between email size and the response time [5]. It is not particularly surprising that both (i) and (ii) would significantly impact the waiting time distribution, generating a heavy tailed distribution for the waiting times even when the behavior of the model otherwise would be exponential, or change the exponent α, thus altering the model's universality class. Some aspects of this problem were addressed recently by Blanchard and Hongler [46]. However, to illustrate the impact of the heavy tailed inputs in Appendix E we study the model of Sect. V with a heavy tailed service time distribution h(s) ∼ s −β with 0 < β < 1.
Finally, could the power law distributed arrival and execution times serve as the proper explanation for the observed heavy tailed interevent time distribution in human dynamics? Note that in a number of systems we observe heavy tailed distributed events without evidence for power law inputs. For example, the timing of the library visits or stock purchases by brokers does not appear to be driven by any known power law inputs, and they have negligible execution time compared with the average observed interevent times. Similarly, the beginning of online games or instant messages is not driven by file sizes either, but only by the time availability for playing a game or sending a message, which is mostly a priority driven issue. Therefore, while it is important to understand the impact of power law inputs on the scaling properties of various models, attempts to explain the waiting times solely based on the heavy tailed inputs only delegate the problem to an earlier cause (the origin of the power law input).
Potential model extensions: Guided by the desire of constructing the simplest models that capture the essence of task execution, we have neglected many processes that are obviously present in human dynamics. For example, we assumed that the priority of the tasks is assigned at the moment the task was added to an individual's priority list, and remains unchanged for the rest of the queuing process. In reality the priorities themselves can change in time. For example, many tasks have deadlines [46], and one could assume that a task's priority diverges as the deadline is approaches. Even in the absence of a clear deadline some priorities may incease in time [46], others may decrease. Sometimes external factors change suddenly a task's priority-for example, the priority of watering the lawn suddenly diminishes if it starts raining. The possibility of dropping tasks, either by not allowing them on the queue, or by simply deleting them from the queue, could also affect the waiting time distributions. Tasks could be dropped if they were not executed for a considerable time interval, and thus become irrelevant, or when the individual is very busy, or some may be simply forgotten. Obviously, the precise impact on the waiting time distribution will depend on the implementation of the task dropping conditions. It is important to understand if any or all of these processes could change the universality class of the waiting time distribution.
Model limitations: The studied datasets do not capture all tasks an individual is involved in, but only the timing of selected activities, like sending emails or borrowing books from the library. Yet, we must consider the fact that between any two recorded events individuals participate in many other nonrecorded activities. For example, if we find that an individual clicks on a new document every few seconds, likely he/she is fully concentrating on web browsing. However, when we notice a break of hours or days between two consecutive clicks, it is clear that in the meantime the individual was involved in a myriad of other activities that were not visible to us. The queuing models discussed here were designed to take into consideration all human activities, as we assume that the priority list of an individual contains all tasks the person is involved in. Currently an understanding of the interplay between the recorded and the invisible activities is still lacking.
Task optimization: The order in which we execute different tasks is often driven by optimization: we try to minimize the total time, or some cost functions. This is particularly relevant if the execution times depend on the order in which the tasks are executed. For example, often executing a certain task might be faster if we execute some other preparatory tasks before, and not in the inverse order. In principle optimization could be incorporated into the studied models by assuming that they determine the priority of the tasks. Optimization raises several important questions for future work: How should we model optimization driven queueing processes? Can they also lead to power laws, and if so, will they result in new universality classes?
Correlations: So far we have focused on the origin of the various distributions observed in human dynamics. Distributions offer little information, however, about potential correlations present in the observed time series. Such correlations were documented in Ref. [26], observing that the correlation function of the interevent time series for printing job arrivals decayed as a power law. Are such temporal correlations present in other systems as well? What is their origin? Can the queuing models predict such correations? Answers to these questions could not only help better understand human dynamics, but could also aid in distingushing the various models from each other.
Network effects: In seaching for the explanation for the observed heavy tailed human activity patterns we limited our study to the properties of single queues. In reality none of our actions are perforned independently-most of our daily activity is embedded in a web of actions of other individuals [47,48]. Indeed, the timing of an email sent to user A may depend on the time we receive an email from user B. An important future goal is to understand how the various human activities and their timing is affected by the fact that the individuals are embedded in a network environment.
Non-human activity patterns: Heavy tailed interevent time distributions do not occur only in human activity, but emerge in many natural and technological systems. For example, Omori's law on earthquaqes [49,50] records heavy tailed interevent times between consecutive seismic activities; measurements indicate that the fishing patters of seabirds also display heavy tailed statistics [51]; plasticity paterns [52] and avalanches in lungs [53] show similar power law interevent times. While a series of models have been proposed to capture some of these processes individually, there is also a possibility that some of these modeling frameworks can be reduced to various queuing processes. Some of the studied queuing models show close relationship to several models designed to capture self-organized criticality [54,55,56,57,58,59]. Could the mechanisms be similar at some fundamental level? Even if such higher degree of universality is absent, understanding the mechanisms and queuing processes that drive human dynamics could help us better understand other natural phenomena as well, from the timing of chemical reactions in a cell to the temporal records of major economic events [47] or the timing of events in manufacturing processes and supply chains [60,61,62] or panic [63].

APPENDIX A: PREFERENTIAL SELECTION PROTOCOL
As we discussed in Sect. VIII, one possible modification of the priority model introduced and studied in Sect. VI involves the assumption that we do not always choose the highest priority task for execution, but rather the tasks are chosen stochastically, in increasing function of their priority. That is, the probability to choose a task with priority x for execution in a unit time is Π(x) ∼ x γ , where γ is a predefined parameter of the model. This parameter allows us to interpolate between the random choice limit (ii) (γ = 0, p = 0) and the deterministic case, when always the highest priority item is chosen for execution (iii) (γ = ∞, p = 1). Note that this parameterization captures the scaling of the model discussed in Sect. VI only in the p → 0 and p → 1 limits, but not for intermediate p values. That is, the two limits of this model map into extreme limits of the model introduced in Sect. VI, but the intermediate p and γ values do not map into each other. The probability that a task with priority x waits a time interval t before execution is f (x, t) = (1 − Π(x)) t−1 Π(x). The average waiting time of a task with priority x is obtained by averaging over t weighted with f (x, t), providing i.e. the higher an item's priority, the shorter is the average time it waits before execution. To calculate P (τ ) we use the fact that the priorities are chosen from the η(x) distribution, i.e. η(x)dx = P (τ )dτ , which gives providing the relationship (16) between α and γ, and indicating that with changing γ we can continuously tune α as well. In the γ → ∞ limit, which converges to the strictly priority based deterministic choice (p = 1) in the model, Eq. (A2) predicts w(t w ) ∼ t −1 w , in agreement with the numerical results (Fig 3a), as well as the empirical data on the email interarrival times (Fig 2a). In the γ = 0 (p = 0) limit t w (x) is independent of x, thus w(t w ) converges to an exponential distribution, as shown in Fig. 3b.
The apparent dependence of w(t w ) on the η(x) distribution from which the agent chooses the priorities may appear to represent a potential problem, as assigning priorities is a subjective process, each individual being characterized by its own η(x) distribution. According to Eq. (A2), however, in the γ → ∞ limit w(t w ) is independent of η(x). Indeed, in the deterministic limit the uniform η(x) can be transformed into an arbitrary η ′ (x) with a parameter change, without altering the order in which the tasks are executed [31]. This insensitivity of the tail to η(x) explains why, despite the diversity of human actions, encompassing both professional and personal priorities, most decision driven processes develop a heavy tail.

APPENDIX B: EXACT SOLUTION OF THE PRIORITY QUEUE MODEL WITH L = 2
Consider the model discussed in Sect. VI [5] with L = 2 [8]. The task that has been just selected and its priority has been reassigned will be called the new task, while the other task will be called the old task. Let η(x) and R(x) = x 0 dxη(x) be the priority probability density function (pdf) and distribution function of the new tasks, which are given. In turn, letη(x, t) andR(x, t) = x 0 dxη(x, t) be the priority pdf and distribution function of the old task in the t-th step. At the (t + 1)-th step there are two tasks on the list, their priorities being distributed according to R(x) andR(x, t), respectively. After selecting one task the old task will have the distribution functioñ where is the probability that the new task is selected given the old task has priority x, and is the probability that the old task is selected given the new task has priority x. In the stationary state,R(x, t + 1) = R(x, t), thus from (B1) we obtaiñ . (B4) Next we turn our attention to the waiting time distribution. Consider a task with priority x that has just been added to the queue. The selection of this task is independent from one step to the other. Therefore, the probability that it waits τ w steps is given by the product of the probability that it is not selected in the first τ w − 1 steps and that it is selected in the τ w -th step. The probability that it is not selected in the first step isq(x), while the probability that it is not selected in the subsequent steps is q(x). Integrating over the new task's possible priorities we obtain (B6) Note that P (τ w ) is independent of the η(x) pdf from which the tasks are selected. Indeed, what matters for task selection is their relative order with respect to other tasks, resulting that all dependences in (B2)-(B4) and (B5) appears via R(x).

APPENDIX C: THE ASYMPTOTIC CHARACTERISTICS OF P (τw)
In Sect. VI we focused on a model with fixed queue length L, demonstrating that it belongs to a new universality class with α = 1. Next we derive a series of results that apply to any queuing model that has a finite queue length, and is characterized by an arbitrary task selection protocol [8]. In each time step there are L tasks in the queue and one of them is executed. Therefore where τ i is the waiting time of the task executed at the i-th step and τ ′ i , i = 1, . . . , L − 1, is the time interval that task i, that is still active at the t-th step, has already spent on the queue. The first term in the l.h.s. of (C1) corresponds to the sum of the waiting times experienced by the t tasks that were executed in the t steps since the beginning of the queue, while the second term describes the sum of the waiting times of the L − 1 tasks that are still on the queue after the t step. Given that in each time step each of the L tasks experience one time step delay, the sum on the l.h.s. should equal Lt. From (C1) it follows that If all active tasks have a chance to be executed sooner or later, like the case for the model studied in Sects. VI in the 0 ≤ p < 1 regime [5], we have τ ′ w ≤ τ w and the last term in (C2) vanishes when t → ∞. In contrast, for p = 1 the numerical simulations [5] indicate that after some transient time the most recently added task is always executed, while L−1 tasks remain indefinitely in the queue. In this case τ ′ i ∼ t in the t → ∞ limit and the last term in (C2) is of the order of L − 1. Based on these arguments we conjecture that the average waiting time of executed tasks is given by which is corroborated by numerical simulations (see Fig. 6b).
It is important to note that the equality in (C2) is independent of the selection protocol, allowing us to reach conclusions that apply beyond the model discussed in Sect. VI. From (C2) we obtain τ w ≤ L . (C4) From this constraint follows that P (τ w ) must decay faster than τ −2 w when τ w → ∞, otherwise τ w would not be bounded. Indeed, it is easy to see that for any α < 2 the average waiting time τ w diverges for Eq. (1). Thus, when τ w → ∞, we must either have where τ 0 > 0 and f (x) = O(bx α−2 ) when x → ∞, where b is a constant. That is, each time an α < 2 exponent is observed (as it is for the empirical data discussed in Sect. III), an exponential cutoff must accompany the scaling. For example, for the model discussed above with L = 2 and 0 ≤ p < 1 we have α = 1 and f (x) decays exponentially (9), in line with the constraint discussed above.

APPENDIX D: TRANSITIONS BETWEEN THE TWO UNIVERSALITY CLASSES
A basic difference between the models discussed in Sect. V and Sects. VI is the capacity of the queue. Our results indicate that the model without limitation on the queue length displays α = 3/2, rooted in the fluctuations of the queue length. In contrast, the model with fixed queue length (Sect. VI) has α = 1, rooted in the queuing of the low priority tasks on the priority list. If indeed the limitation in the queue length plays an important role, we should be able to develop a model that can display a transition from the α = 3/2 to the α = 1 universality class as we limit the fluctuations in the queue length. In this section we study such a model, interpolating between the two observed scaling regimes. We start from the model discussed in Sect. V, and impose on it a maximum queue length L. This can be achieved by altering the arrival rate of the tasks: when there are L tasks in the queue no new tasks will be accepted until at least one of the tasks is executed. Mathematically this implies that the arrival rate depends on the queue length as In the stationary state the queue length distribution P (ℓ) satisfies the balance equation λ ℓ−1 P (ℓ − 1) + µ ℓ+1 P (ℓ + 1) = (λ ℓ + µ ℓ ) P (ℓ) , (D2) where µ ℓ = 0 , ℓ = 0 µ , 0 < ℓ ≤ L .
From (D2) we obtain the queue length distribution as suggesting the existence of three scaling regions. Subcritical regime, ρ ≪ 1: If the arrival rate of the tasks is much smaller than the execution rate, the fact that the queue length has an upper bound has little significance, since ℓ will rarely reach its upper bound L, but will fluctuate in the vicinity of ℓ = 0. This regime can be reached either for ρ ≪ 1 and L fixed or for ρ < 1 and L ≫ 1. Therefore, in this case the waiting time distribution is well approximated by that of the model with an unlimited queue length, displaying the scaling predicted by Eq. (4), i.e. either exponential, or a power law with α = 3/2, coupled with an exponential cutoff (see Fig.  8a).
Critical regime: For ρ = 1 we observe an interesting interplay between the queue length and L. Normally in this critical regime ℓ(t) should follow a random walk with the return time probability density scaling with exponent 3/2. However, the limitation imposed on the queue length limits the power law waiting time distribution predicted by Eq. (4), introducing a cutoff (see Fig. 8b). Indeed having the number of tasks in the queue limited allows each task to be executed in a finite time.
Supercritical regime: When ρ ≫ 1 from (D4) follows that i.e. with probability almost one the queue is filled. Thus, in the supercritical regime ρ ≫ 1 new tasks are added to the queue immediately after a task is executed. If we take the number of executed tasks as a new reference time then this model corresponds to the one discussed in Sect. VI, displaying α = 1 [5], as supported by the numerical simulations (see Fig. 8a).

APPENDIX E: HEAVY TAILED INPUT DISTRIBUTIONS
In this Appendix we study the model discussed in Sec. V with a heavy tailed service time distribution h(s) ∼ s −β with 0 < β < 1. In this case it has been shown that [38] P (τ w ) ∼ τ −β w .
This result is a consequence of the generalized limit theorem for heavy tailed distributions [1]. Let us focus on a selected task and assume that m tasks need to be executed before it. Therefore, the selected task's waiting time is given by where s l is the service time of the l-th task executed before the given task. Equation (E2) represents the sum of m independent and identically distributed random variables, with pdf h(s) ∼ s −β , which is known to follow a pdf with the same heavy tail, and thus resulting in (E1). Hence, in this case the heavy tail in the waiting time distribution is a consequence of the heavy tails in the service time distribution.