Kendall’s Notation serves as a foundational language in the field of queuing theory, providing a concise and standardized method for classifying and describing a wide array of queuing systems. Introduced by David G. Kendall in 1953 as a three-part code and later extended to six parts, the notation distills the complex characteristics of a queue into a compact label that is universally understood by researchers, practitioners, and students alike. Its primary purpose is to abstract the essential features of a waiting line system, allowing for the application of appropriate mathematical models and the subsequent analysis of performance metrics such as average waiting time, queue length, and system utilization.
The power of Kendall’s Notation lies in its ability to encapsulate the stochastic processes governing customer arrivals and service times, the structural layout of the service facility, and the operational rules that dictate how customers are managed within the system. By clearly defining these parameters, it enables a systematic approach to understanding, predicting, and optimizing the flow of entities through various service points, ranging from supermarket checkout lines and call centers to manufacturing production lines and computer networks. This standardized classification is indispensable for effective communication and comparison of different queuing scenarios, providing a bedrock upon which the entire edifice of queuing analysis is built.
Understanding Kendall's Notation Components
Kendall’s Notation typically takes the form A/B/C/D/E/F, where each letter represents a distinct characteristic of the queuing system. While often abbreviated to A/B/C, especially when the last three parameters are assumed to be standard (infinite capacity, infinite calling population, and first-come, first-served discipline), a comprehensive understanding requires delving into each of the six components.
A: Arrival Distribution
The first component, ‘A’, specifies the probability distribution of the inter-arrival times of customers. Inter-arrival time is the duration between the arrival of one customer and the arrival of the next. Understanding this distribution is critical because it dictates the rate and variability at which demands are placed on the system.
- M (Markovian / Poisson): This is the most commonly encountered distribution for arrivals. It signifies that inter-arrival times follow an exponential distribution, which implies that the arrival process is a Poisson process. A key property of the exponential distribution is its memorylessness, meaning the probability of an arrival in the next instant is independent of how long it has been since the last arrival. This simplifies mathematical analysis considerably, making M/M/x queues solvable analytically. Many real-world systems, especially those with a large number of independent potential arrivals (e.g., calls to a customer service center, customers arriving at a store), can be reasonably approximated by a Poisson arrival process.
- D (Deterministic): This indicates that inter-arrival times are constant and fixed. For instance, if customers arrive precisely every 10 minutes, the inter-arrival time is deterministic. This scenario is less common in natural, unforced human systems but can be found in controlled environments such as scheduled bus services, automated manufacturing lines with fixed cycle times, or paced production.
- G (General): This denotes that the inter-arrival times follow any arbitrary general probability distribution. When ‘G’ is used, it means no specific distributional assumption (like exponential or deterministic) is made beyond the existence of a mean and variance. Analyzing ‘G’ systems is generally more complex and often requires numerical methods, simulation, or more advanced mathematical techniques as the memoryless property of ‘M’ is lost. However, it offers greater realism for situations where arrival patterns are irregular or exhibit specific characteristics not captured by simpler distributions.
- Ek (Erlang): Though less commonly seen than M, D, or G, the Erlang distribution is significant. It is a special case of the Gamma distribution and is often used to model inter-arrival times that are more regular than exponential but not purely deterministic. An Erlang-k distribution is the sum of k independent and identically distributed exponential random variables. As k increases, the Erlang distribution approaches a deterministic distribution, while Erlang-1 is equivalent to the exponential distribution.
The choice of arrival distribution significantly impacts the variability of the input stream, which in turn affects queue lengths and waiting times. High variability (e.g., M) generally leads to longer queues and waiting times compared to low variability (e.g., D) for the same average arrival rate.
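To make the variability contrast concrete, the following minimal sketch (Python, standard library only) draws inter-arrival times from the three distributions discussed above and reports the coefficient of variation for each: roughly 1 for M, 0 for D, and 1/√k for Erlang-k. The arrival rate of one per minute and the sample size are illustrative assumptions, not values from the text.

```python
import random
import statistics

def exponential_samples(rate, n):
    """M: exponential inter-arrival times (Poisson arrival process)."""
    return [random.expovariate(rate) for _ in range(n)]

def deterministic_samples(rate, n):
    """D: constant inter-arrival times."""
    return [1.0 / rate] * n

def erlang_k_samples(rate, k, n):
    """Ek: sum of k exponential phases, each with rate k*rate,
    so the mean inter-arrival time stays at 1/rate."""
    return [sum(random.expovariate(k * rate) for _ in range(k)) for _ in range(n)]

def coefficient_of_variation(samples):
    """Standard deviation divided by mean: about 1 for M, 0 for D, 1/sqrt(k) for Ek."""
    return statistics.stdev(samples) / statistics.mean(samples)

if __name__ == "__main__":
    random.seed(42)
    rate, n = 1.0, 100_000   # hypothetical: one arrival per minute on average
    for name, data in [
        ("M  (exponential)", exponential_samples(rate, n)),
        ("D  (deterministic)", deterministic_samples(rate, n)),
        ("E4 (Erlang-4)", erlang_k_samples(rate, 4, n)),
    ]:
        print(f"{name}: mean={statistics.mean(data):.3f}, CV={coefficient_of_variation(data):.3f}")
```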
B: Service Distribution
The second component, ‘B’, describes the probability distribution of the service times. Service time is the duration a customer spends receiving service from a server. Similar to the arrival distribution, this parameter is crucial for understanding the capacity and variability of the service mechanism.
- M (Markovian / Exponential): This implies that service times follow an exponential distribution. Just like for arrivals, this provides the memoryless property, meaning the remaining service time for a customer is independent of how much service they have already received. This property greatly simplifies mathematical modeling and is often assumed in many analytical queuing models due to its tractability. Examples include customer service interactions where service duration can be highly variable, or repair times for machinery.
- D (Deterministic): This indicates that service times are constant and fixed for every customer. This might be found in automated processes, assembly lines where each task takes a precise amount of time, or certain digital processes like data transfer at a fixed bit rate. Deterministic service times generally lead to more predictable system behavior and shorter queues.
- G (General): This signifies that service times follow any arbitrary general probability distribution. Similar to ‘G’ for arrivals, this offers greater flexibility and realism when service times do not fit standard exponential or deterministic patterns. Analyzing ‘G’ systems often requires more sophisticated techniques.
- Ek (Erlang): As with arrivals, the Erlang distribution can be used for service times, particularly when service processes involve multiple stages, each taking an exponentially distributed time.
The interplay between arrival and service variability is central to queuing theory. High variability in both (e.g., M/M/x) leads to the most pronounced queuing effects, while deterministic processes (e.g., D/D/x) represent the ideal, often queue-free scenario given sufficient capacity.
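As a small numerical illustration of this interplay, the sketch below compares the average queue length of an M/M/1 queue, Lq = ρ²/(1 − ρ), with that of an M/D/1 queue, Lq = ρ²/(2(1 − ρ)): removing all service-time variability halves the queue at the same utilization. The utilization levels in the example are illustrative assumptions.

```python
def lq_mm1(rho):
    """Average number waiting in an M/M/1 queue (exponential service)."""
    return rho**2 / (1.0 - rho)

def lq_md1(rho):
    """Average number waiting in an M/D/1 queue (deterministic service);
    exactly half of the M/M/1 value at the same utilization."""
    return rho**2 / (2.0 * (1.0 - rho))

if __name__ == "__main__":
    for rho in (0.5, 0.8, 0.9, 0.95):   # utilization levels chosen for illustration
        print(f"rho={rho:.2f}  Lq(M/M/1)={lq_mm1(rho):6.2f}  Lq(M/D/1)={lq_md1(rho):6.2f}")
```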
C: Number of Servers
The third component, ‘C’, specifies the number of parallel service channels or servers available in the system. This is an integer representing the capacity of the service facility to handle customers simultaneously.
- 1 (Single Server): This is the simplest and most commonly analyzed configuration, where only one customer can be served at a time. Examples include a single teller at a bank, a single doctor’s office, or a one-person hotdog stand.
- c (Multiple Servers): This denotes a finite number of ‘c’ parallel servers, where ‘c’ is an integer greater than one. Each server can handle one customer at a time, and customers typically wait in a single queue before being dispatched to the first available server. Examples include multiple checkout counters in a supermarket, a bank with several tellers, or a call center with numerous agents. Increasing the number of servers generally reduces average waiting times and queue lengths, provided the total service capacity exceeds the arrival rate.
- ∞ (Infinite Servers): While theoretically infinite, this practically implies that there are always enough servers available to ensure that no customer ever has to wait for service. Every arriving customer immediately finds a free server. This model is useful for scenarios where the “server” is a resource that is always readily available or the processing capacity is so vast that congestion is negligible. Examples include parking lots with ample space for every arriving car, self-service facilities, or a highly over-provisioned cloud computing service where processing tasks can be handled without delay.
The number of servers directly dictates the system’s potential throughput and its ability to absorb fluctuations in demand. A system with more servers can handle higher arrival rates before becoming unstable, and for a given arrival rate, it will typically exhibit lower waiting times and shorter queues.
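One standard way to quantify the benefit of additional servers in the M/M/c setting is the Erlang C formula, which gives the probability that an arriving customer must wait and, from it, the average waiting time. The sketch below is a minimal implementation; the arrival rate, service rate, and server counts in the example are illustrative assumptions.

```python
from math import factorial

def erlang_c(arrival_rate, service_rate, servers):
    """Probability that an arriving customer must wait in an M/M/c queue
    (Erlang C formula). Requires arrival_rate < servers * service_rate."""
    a = arrival_rate / service_rate          # offered load in Erlangs
    rho = a / servers                        # per-server utilization
    if rho >= 1.0:
        raise ValueError("Unstable system: arrival rate exceeds total service capacity")
    numerator = (a**servers / factorial(servers)) / (1.0 - rho)
    denominator = sum(a**k / factorial(k) for k in range(servers)) + numerator
    return numerator / denominator

def avg_wait_in_queue(arrival_rate, service_rate, servers):
    """Average waiting time before service (Wq) for an M/M/c queue."""
    p_wait = erlang_c(arrival_rate, service_rate, servers)
    return p_wait / (servers * service_rate - arrival_rate)

if __name__ == "__main__":
    lam, mu = 8.0, 1.0   # hypothetical: 8 arrivals/hour, 1 service/hour per server
    for c in (9, 10, 12):
        print(f"c={c:2d}  P(wait)={erlang_c(lam, mu, c):.3f}  Wq={avg_wait_in_queue(lam, mu, c):.3f} hours")
```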
D: System Capacity (K)
The fourth component, ‘D’, represents the maximum number of customers allowed in the system, including those being served and those waiting in the queue. This is often referred to as the system’s “buffer size” or “queue capacity.”
- K (Finite Capacity): When ‘K’ is a finite integer, it means the system has a limited capacity. If an arriving customer finds the system full (K customers already present), that customer is blocked from entering and is lost to the system. This often occurs in physical systems with limited waiting areas, such as a small waiting room in a clinic, a production buffer of a specific size, or a telecommunications switch with a finite number of lines. Finite capacity systems are important for modeling lost sales or calls in business contexts.
- ∞ (Infinite Capacity): This is the default assumption if ‘D’ is omitted from the notation. It implies that there is no limit to the number of customers who can wait in the queue or be served in the system. While no real-world system truly has infinite capacity, this assumption is often a reasonable approximation when the capacity is so large that it is rarely reached, simplifying analysis by avoiding the complexities of customer rejection or balking.
The system capacity influences the probability of customers being lost due to congestion. In a finite capacity system, the arrival rate effectively decreases as the system approaches its limit because customers are turned away.
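For the finite-capacity single-server case (M/M/1/K), the steady-state probabilities have a simple closed form, and the probability that an arriving customer is turned away is simply the probability that the system is full. The sketch below computes this blocking probability and the resulting effective arrival rate; the parameter values are illustrative assumptions.

```python
def mm1k_blocking_probability(arrival_rate, service_rate, capacity):
    """Probability that an arriving customer finds an M/M/1/K system full
    (and is therefore lost). capacity = K counts the customer in service."""
    rho = arrival_rate / service_rate
    if rho == 1.0:
        return 1.0 / (capacity + 1)          # all K+1 states are equally likely
    return (1.0 - rho) * rho**capacity / (1.0 - rho**(capacity + 1))

if __name__ == "__main__":
    lam, mu = 0.9, 1.0                       # hypothetical rates; rho = 0.9
    for K in (3, 5, 10, 20):
        p_block = mm1k_blocking_probability(lam, mu, K)
        print(f"K={K:2d}  P(blocked)={p_block:.3f}  effective arrival rate={lam * (1 - p_block):.3f}")
```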
E: Calling Population (N)
The fifth component, ‘E’, specifies the size of the calling population, which is the total number of potential customers who can request service.
- N (Finite Population): When ‘N’ is a finite integer, it indicates that the number of potential customers is limited. In such a system, as customers enter the queue and are served, they are temporarily removed from the calling population, which affects the rate of future arrivals. Once a customer completes service, they typically return to the calling population. This is common in scenarios like machine repair problems (e.g., a fixed number of machines that can break down and require repair) or a small pool of employees sharing a common resource.
- ∞ (Infinite Population): This is the default assumption if ‘E’ is omitted. It implies that the number of potential customers is so large that the arrival rate is not significantly affected by the number of customers currently in the system. Most general queuing problems, such as customers arriving at a retail store or calls to a large call center, assume an infinite calling population because the fraction of the population in the system at any given time is negligible.
The distinction between finite and infinite calling populations is crucial for accurately modeling arrival rates. In a finite population, the arrival rate is dependent on the number of customers currently outside the system; as more customers are in the queue or being served, fewer are available to generate new arrivals.
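The machine-repair version of a finite population can be captured with a state-dependent arrival rate: with N machines in total and n of them already in the repair system, only the remaining N − n can fail. The sketch below evaluates the standard birth-death steady-state probabilities for a single repairer; the machine count and the failure and repair rates are illustrative assumptions.

```python
from math import factorial

def machine_repair_probabilities(num_machines, failure_rate, repair_rate):
    """Steady-state probabilities for the single-repairer machine-repair model
    (finite calling population of size N): with n machines down, failures
    occur at rate (N - n) * failure_rate and repairs at repair_rate."""
    ratio = failure_rate / repair_rate
    unnormalized = [
        factorial(num_machines) / factorial(num_machines - n) * ratio**n
        for n in range(num_machines + 1)
    ]
    total = sum(unnormalized)
    return [p / total for p in unnormalized]

if __name__ == "__main__":
    N, lam, mu = 5, 0.1, 1.0   # hypothetical: 5 machines, each fails 0.1/hour, repairs take 1 hour on average
    probs = machine_repair_probabilities(N, lam, mu)
    avg_down = sum(n * p for n, p in enumerate(probs))
    print(f"P(no machine down) = {probs[0]:.3f}, average machines down = {avg_down:.3f}")
    # The effective failure (arrival) rate shrinks as more machines are already down:
    eff_rate = sum((N - n) * lam * p for n, p in enumerate(probs))
    print(f"Effective failure rate = {eff_rate:.3f} per hour (vs. {N * lam:.3f} if all were up)")
```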
F: Service Discipline (Queue Discipline)
The sixth and final component, ‘F’, describes the rule by which customers are selected from the queue for service. This determines the order in which waiting customers are served.
- FIFO (First-In, First-Out) / FCFS (First-Come, First-Served): This is the most common and often assumed discipline if ‘F’ is omitted from the notation. Customers are served in the strict order of their arrival. This is the standard in most real-world waiting lines (e.g., supermarket queues, bank queues). It is generally considered the fairest discipline, as it prevents customers who arrive later from being served before those who arrived earlier.
- LIFO (Last-In, First-Out): Customers who arrive most recently are served first. This is less common for human queues but can occur in scenarios like inventory management (e.g., “stacking” items where the last item placed on top is the first one retrieved).
- SJF (Shortest Job First): Customers with the shortest estimated service times are selected first, regardless of their arrival time. This discipline aims to minimize the average waiting time in the system. It requires prior knowledge or estimation of service times and can lead to “starvation” for customers with very long service requirements.
- SRPT (Shortest Remaining Processing Time): A pre-emptive version of SJF, where if a new customer arrives with a shorter remaining service time than the customer currently being served, the current service is interrupted, and the new customer is served. This is common in computer operating systems scheduling.
- PRI (Priority): Customers are assigned priorities (e.g., high, medium, low), and higher-priority customers are served before lower-priority ones, regardless of their arrival time. This can be non-preemptive (higher priority waits for current service to finish) or preemptive (higher priority interrupts current service). Examples include emergency services in hospitals, VIP lines at airports, or urgent tasks in a production system.
- RR (Round Robin): Commonly used in computer time-sharing systems, where each customer (process) is given a small quantum of service time. If service is not completed within that time, the customer returns to the end of the queue, and the next customer is served. This provides fair sharing of resources and gives the illusion of simultaneous processing.
- RSS (Random Selection for Service): Customers are chosen randomly from the queue for service. This is rarely intentional in human queues but can arise in highly disorganized systems or when customers are indistinguishable.
The service discipline strongly influences individual customer waiting times and the perceived fairness of the system, even though aggregate measures such as the throughput of a stable, work-conserving system are determined by the arrival rate and are therefore unaffected by the order of service.
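The selection logic behind several of these disciplines can be stated very compactly. The sketch below shows, for a hypothetical set of waiting customers, how FCFS, LIFO, SJF, and non-preemptive priority each choose the next customer to serve; it deliberately ignores timing and pre-emption, which a full simulation would need to model.

```python
from dataclasses import dataclass

@dataclass
class Customer:
    name: str
    arrival_time: float      # when the customer joined the queue
    service_time: float      # (estimated) service requirement
    priority: int = 0        # larger value = more urgent

def next_fcfs(queue):
    """First-Come, First-Served: earliest arrival goes next."""
    return min(queue, key=lambda c: c.arrival_time)

def next_lifo(queue):
    """Last-In, First-Out: most recent arrival goes next."""
    return max(queue, key=lambda c: c.arrival_time)

def next_sjf(queue):
    """Shortest Job First: smallest (estimated) service time goes next."""
    return min(queue, key=lambda c: c.service_time)

def next_priority(queue):
    """Non-preemptive priority: highest priority first, FCFS within a class."""
    return min(queue, key=lambda c: (-c.priority, c.arrival_time))

if __name__ == "__main__":
    waiting = [
        Customer("A", arrival_time=1.0, service_time=8.0, priority=0),
        Customer("B", arrival_time=2.0, service_time=1.0, priority=0),
        Customer("C", arrival_time=3.0, service_time=4.0, priority=2),
    ]
    for rule in (next_fcfs, next_lifo, next_sjf, next_priority):
        print(f"{rule.__name__}: serves {rule(waiting).name} next")
```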
Standard Notations and Applications
While the full A/B/C/D/E/F notation provides comprehensive detail, it is common practice to omit the last three parameters (D, E, F) if they assume their default values: infinite capacity (D=∞), infinite calling population (E=∞), and First-Come, First-Served discipline (F=FCFS). Thus, the notation M/M/1 implicitly means M/M/1/∞/∞/FCFS. This simplified notation is frequently encountered in textbooks and academic discussions.
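Because the defaults are so frequently omitted, it can be convenient to expand an abbreviated label back into its full six-part form. The sketch below is a small, illustrative parser that fills in the conventional defaults described above; the function name and its handling of unusual inputs are assumptions made for illustration.

```python
def expand_kendall(notation):
    """Expand an abbreviated Kendall label (e.g. 'M/M/1') to the full
    six-part form A/B/C/D/E/F, filling in the conventional defaults:
    infinite capacity, infinite calling population, FCFS discipline."""
    parts = notation.strip().split("/")
    if len(parts) < 3 or len(parts) > 6:
        raise ValueError("Expected between 3 and 6 slash-separated components")
    defaults = ["inf", "inf", "FCFS"]        # defaults for D, E, F
    full = parts + defaults[len(parts) - 3:]
    labels = ("arrival", "service", "servers", "capacity", "population", "discipline")
    return dict(zip(labels, full))

if __name__ == "__main__":
    print(expand_kendall("M/M/1"))
    # {'arrival': 'M', 'service': 'M', 'servers': '1',
    #  'capacity': 'inf', 'population': 'inf', 'discipline': 'FCFS'}
    print(expand_kendall("M/G/1/K"))
```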
- M/M/1: This is the most fundamental and widely analyzed queuing model. It describes a system with Poisson arrivals, exponential service times, and a single server. It serves as a benchmark for understanding basic queuing phenomena.
- M/M/c: This extends the M/M/1 model to multiple parallel servers, still with Poisson arrivals and exponential service times. It’s crucial for modeling scenarios like call centers, where multiple agents handle customer inquiries.
- M/G/1: A system with Poisson arrivals, general service times, and a single server. This model is more complex than M/M/1 but allows for more realistic service time distributions. The Pollaczek-Khinchine formula is often used for its analysis.
- G/G/1: The most general single-server model, where both arrival and service times can follow any distribution. This often requires simulation for analysis due to its complexity; a small simulation sketch follows this list.
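As noted for G/G/1 above, a single-server FIFO queue can be simulated with the Lindley recursion, W(n+1) = max(0, W(n) + S(n) − T(n+1)), where S(n) is the nth service time and T(n+1) the next inter-arrival time. The sketch below runs the recursion for an M/D/1 queue and compares the simulated mean wait with the value given by the Pollaczek-Khinchine formula; the rates and sample size are illustrative assumptions.

```python
import random

def simulate_gg1_waits(interarrival, service, num_customers):
    """Lindley recursion for a single-server FIFO queue:
    W[n+1] = max(0, W[n] + S[n] - T[n+1]).
    `interarrival` and `service` are zero-argument samplers."""
    waits = [0.0]
    for _ in range(num_customers - 1):
        waits.append(max(0.0, waits[-1] + service() - interarrival()))
    return waits

def pk_mean_wait(arrival_rate, mean_service, second_moment_service):
    """Pollaczek-Khinchine mean waiting time for M/G/1:
    Wq = lambda * E[S^2] / (2 * (1 - rho))."""
    rho = arrival_rate * mean_service
    return arrival_rate * second_moment_service / (2.0 * (1.0 - rho))

if __name__ == "__main__":
    random.seed(7)
    lam, s = 0.8, 1.0                                   # hypothetical: 0.8 arrivals per unit time, fixed 1.0 service
    waits = simulate_gg1_waits(
        interarrival=lambda: random.expovariate(lam),   # M arrivals
        service=lambda: s,                              # D service, so this is an M/D/1 queue
        num_customers=200_000,
    )
    simulated = sum(waits) / len(waits)
    analytic = pk_mean_wait(lam, s, s * s)              # E[S^2] = s^2 for deterministic service
    print(f"simulated Wq = {simulated:.3f}, Pollaczek-Khinchine Wq = {analytic:.3f}")
```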
Kendall’s Notation is indispensable for applying the vast array of mathematical models developed within queuing theory. Each combination of components (e.g., M/M/1, M/M/c, M/G/1) has specific formulas and analytical techniques associated with it for calculating key performance metrics (a worked M/M/1 sketch follows this list):
- Average Queue Length (Lq): The average number of customers waiting in the queue.
- Average Number of Customers in the System (Ls): The average number of customers in the queue plus those being served.
- Average Waiting Time in Queue (Wq): The average time a customer spends waiting before service begins.
- Average Waiting Time in System (Ws): The average total time a customer spends in the system (waiting plus service).
- Server Utilization (ρ): The proportion of time servers are busy.
- Probability of an Empty System (P0): The probability that there are no customers in the system.
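For the M/M/1 case, all of these quantities follow from the utilization ρ = λ/μ, and Little’s law (L = λW) ties the length and time measures together. The sketch below collects the standard closed-form expressions; the arrival and service rates in the example are illustrative assumptions.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Standard steady-state performance metrics for an M/M/1 queue.
    Requires arrival_rate < service_rate for the queue to be stable."""
    if arrival_rate >= service_rate:
        raise ValueError("Unstable system: arrival rate must be below service rate")
    rho = arrival_rate / service_rate
    return {
        "utilization rho": rho,
        "P0 (empty system)": 1.0 - rho,
        "Lq (avg. number in queue)": rho**2 / (1.0 - rho),
        "Ls (avg. number in system)": rho / (1.0 - rho),
        "Wq (avg. wait in queue)": rho / (service_rate - arrival_rate),
        "Ws (avg. time in system)": 1.0 / (service_rate - arrival_rate),
    }

if __name__ == "__main__":
    # Hypothetical rates: 4 arrivals per hour, 5 services per hour.
    for name, value in mm1_metrics(4.0, 5.0).items():
        print(f"{name}: {value:.3f}")
    # Little's law check: Ls = lambda * Ws and Lq = lambda * Wq hold exactly for these formulas.
```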
These metrics are vital for operational management, capacity planning, resource allocation, and process improvement across numerous industries. For example, in a hospital emergency room, knowing the expected waiting time (Wq) for an M/M/c system can inform staffing levels. In a manufacturing plant, understanding the impact of machine breakdowns (arrivals) and repair times (service) on production buffers (queue capacity) can be critical.
Limitations and Advanced Considerations
Despite its widespread utility, Kendall’s Notation has inherent limitations. It is a simplification and does not capture all the nuances of real-world queuing systems.
- Steady-State Assumption: Most analytical models based on Kendall’s Notation assume a steady-state condition, meaning the system has been operating for a long time and its statistical properties are stable. It does not easily describe transient behavior, such as the initial build-up of a queue or sudden changes in arrival rates.
- Complex Customer Behavior: The notation does not directly account for complex customer behaviors like balking (leaving without joining the queue because it’s too long), reneging (joining the queue but leaving before being served due to impatience), or jockeying (moving between parallel queues). While these can sometimes be integrated into more advanced models, they are not explicit in the notation itself.
- Multi-Stage Queues: Many real-world systems involve multiple service stages in sequence (e.g., registration -> doctor -> pharmacy). Kendall’s Notation describes a single service node; analyzing a network of queues requires more sophisticated techniques.
- Server Breakdowns and Vacations: The notation assumes servers are always available and never break down or go on vacation.
- Cost Analysis: While queuing theory helps quantify performance, Kendall’s Notation does not inherently include cost factors associated with waiting customers or idle servers, which are crucial for economic optimization.
For scenarios that go beyond the capabilities of standard Kendall’s Notation, operations researchers often employ advanced techniques such as queuing networks (e.g., Jackson networks), simulation modeling, or specialized analytical methods that incorporate the complexities missed by the notation. However, even in these advanced applications, Kendall’s Notation often serves as the starting point for defining the basic characteristics of each individual node within a larger system.
Kendall’s Notation stands as a cornerstone of queuing theory, providing an elegant and powerful framework for categorizing and analyzing the ubiquitous phenomenon of waiting lines. Its simplicity and universality have enabled generations of analysts and practitioners to model, understand, and optimize systems where demand for service exceeds immediate capacity. By clearly defining the probability distributions of arrivals and service, the number of servers, system capacity, calling population, and service discipline, it allows for a structured approach to problem-solving.
This standardized language facilitates effective communication among professionals and underpins the application of various mathematical models to predict system performance. From designing efficient call centers and optimizing manufacturing flow to managing patient queues in healthcare and streamlining traffic, the principles embedded within Kendall’s Notation are fundamental to ensuring operational efficiency and customer satisfaction. Its continued relevance underscores its profound impact on the field of operations research and its practical utility across diverse sectors.
While acknowledging its limitations in capturing every minute detail of highly complex or dynamic systems, Kendall’s Notation remains an indispensable tool. It serves as a crucial starting point for conceptualizing queuing problems and often forms the basis for more sophisticated simulations and analyses. As businesses and organizations increasingly rely on data-driven decision-making, the foundational understanding provided by Kendall’s Notation will continue to be vital in shaping strategies that balance resource utilization with service quality in an ever-interconnected world.