Daily Papers

by AK and the research community

Oct 29

CSnake: Detecting Self-Sustaining Cascading Failure via Causal Stitching of Fault Propagations

Recent studies have revealed that self-sustaining cascading failures in distributed systems frequently lead to widespread outages, which are challenging to contain and recover from. Existing failure detection techniques struggle to expose such failures prior to deployment, because these failures typically require a complex combination of specific conditions to be triggered. This challenge stems from the inherent nature of cascading failures: they typically involve a sequence of fault propagations, each activated by distinct conditions. This paper presents CSnake, a fault injection framework to expose self-sustaining cascading failures in distributed systems. CSnake uses the novel idea of causal stitching, which causally links multiple single-fault injections in different tests to simulate complex fault propagation chains. To identify these chains, CSnake designs a counterfactual causality analysis of fault propagations, called fault causality analysis (FCA). FCA compares the execution trace of a fault injection run with its corresponding profile run (i.e., the same test without the injection) and identifies any additional faults triggered, which are considered to have a causal relationship with the injected fault. To address the large search space of fault and workload combinations, CSnake employs a three-phase test-budget allocation protocol that prioritizes faults with unique and diverse causal consequences, increasing the likelihood of uncovering conditional fault propagations. Furthermore, to avoid incorrectly connecting fault propagations from workloads with incompatible conditions, CSnake performs a low-overhead local compatibility check that approximately checks the compatibility of the path constraints associated with connected fault propagations. CSnake detected 15 bugs that cause self-sustaining cascading failures in five systems, five of which have been confirmed and two of which have been fixed.

  • 3 authors
·
Sep 30
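
The CSnake abstract above describes FCA as a diff between an injection run and its profile run, and causal stitching as linking single-fault results across tests. The following is a minimal sketch of that idea only, not the authors' implementation. The trace representation (a list of fault names), the function names, and the use of cycle search to flag self-sustaining candidates are all assumptions made for illustration; the three-phase budget allocation and the compatibility check are omitted.

```python
from collections import defaultdict


def fault_causality(profile_trace, injection_trace, injected_fault):
    """Hypothetical FCA step: faults seen in the injection run but not in the
    profile run are treated as consequences of the injected fault."""
    baseline = set(profile_trace)
    return {
        fault
        for fault in injection_trace
        if fault not in baseline and fault != injected_fault
    }


def build_causal_graph(runs):
    """Stitch single-fault results from different tests into one graph.

    `runs` is an iterable of (injected_fault, profile_trace, injection_trace).
    An edge A -> B means "injecting A triggered B in at least one test".
    """
    graph = defaultdict(set)
    for injected, profile, injection in runs:
        graph[injected] |= fault_causality(profile, injection, injected)
    return graph


def find_cycles(graph):
    """Return fault chains that lead back to their starting fault.

    Assumption: a cycle in the stitched graph is a candidate self-sustaining
    cascade. Each cycle is reported once, starting at its smallest fault name.
    """
    cycles = []

    def dfs(start, node, path):
        for nxt in sorted(graph.get(node, ())):
            if nxt == start:
                cycles.append(path + [nxt])
            elif nxt not in path and nxt > start:
                dfs(start, nxt, path + [nxt])

    for start in sorted(graph):
        dfs(start, start, [start])
    return cycles
```

For example, three separate tests in which fault `a` triggers `b`, `b` triggers `c`, and `c` triggers `a` would stitch into the single chain `a, b, c, a`, even though no individual test exhibits the full loop.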

Outdoor-to-Indoor 28 GHz Wireless Measurements in Manhattan: Path Loss, Environmental Effects, and 90% Coverage

Outdoor-to-indoor (OtI) signal propagation further challenges the already tight link budgets at millimeter-wave (mmWave). To gain insight into OtI mmWave scenarios at 28 GHz, we conducted an extensive measurement campaign consisting of over 2,200 link measurements. In total, 43 OtI scenarios were measured in West Harlem, New York City, covering seven highly diverse buildings. The measured OtI path gain can vary by up to 40 dB for a given link distance, and the empirical path gain model for all data shows an average of 30 dB excess loss over free space at distances beyond 50 m, with an RMS fitting error of 11.7 dB. The type of glass is found to be the single dominant feature for OtI loss, with an observed difference of 20 dB between the empirical path gain models for scenarios with low-loss and high-loss glass. The presence of scaffolding, tree foliage, or elevated subway tracks, as well as differences in floor height, are each found to have an impact of 5-10 dB. We show that for urban buildings with high-loss glass, OtI coverage can support 500 Mbps for 90% of indoor user equipment (UEs) with a base station (BS) antenna placed up to 49 m away. For buildings with low-loss glass, such as our case study covering multiple classrooms of a public school, data rates over 2.5 Gbps and 1.2 Gbps are possible from a BS 68 m and 175 m away from the school building, respectively, when a line-of-sight path is available. We expect these results to be useful for the deployment of mmWave networks in dense urban environments as well as the development of relevant scheduling and beam management algorithms.

  • 15 authors
·
May 19, 2022
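
To put the "30 dB excess loss over free space" figure from the 28 GHz abstract above in context, the sketch below computes the standard free-space path loss and applies a fixed excess-loss offset. This is a rough illustration, not the paper's fitted model: treating the 30 dB average as a constant offset is an assumption, and the function names and default parameters are hypothetical.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def free_space_path_loss_db(distance_m, freq_hz):
    """Standard free-space path loss: 20 * log10(4 * pi * d * f / c)."""
    return 20.0 * math.log10(
        4.0 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT_M_S
    )


def oti_path_gain_db(distance_m, freq_hz=28e9, excess_loss_db=30.0):
    """Approximate outdoor-to-indoor path gain in dB (negative values).

    Assumption: the abstract's average 30 dB excess loss is applied as a
    constant on top of free space; the paper's empirical fit may differ.
    """
    return -(free_space_path_loss_db(distance_m, freq_hz) + excess_loss_db)


if __name__ == "__main__":
    for d in (50, 100, 175):
        print(f"{d:>4} m: {oti_path_gain_db(d):.1f} dB")
```

At 50 m and 28 GHz, free-space loss alone is roughly 95 dB, so the offset model gives a path gain near -125 dB. The reported 11.7 dB RMS fitting error and the 20 dB gap between glass types mean that individual links can sit well away from this average.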

Random Spatial Networks: Small Worlds without Clustering, Traveling Waves, and Hop-and-Spread Disease Dynamics

Random network models play a prominent role in modeling, analyzing and understanding complex phenomena on real-life networks. However, a key property of networks is often neglected: many real-world networks exhibit spatial structure, the tendency of a node to select neighbors with a probability depending on physical distance. Here, we introduce a class of random spatial networks (RSNs) which generalizes many existing random network models but adds spatial structure. In these networks, nodes are placed randomly in space and joined by edges with a probability depending on their distance and their individual expected degrees, in a manner that crucially remains analytically tractable. We use this network class to propose a new generalization of small-world networks, where the average shortest path lengths in the graph are small, as in classical Watts-Strogatz small-world networks, but with close spatial proximity of nodes that are neighbors in the network playing the role of large clustering. Small-world effects are demonstrated on these spatial small-world networks without clustering. We are able to derive partial integro-differential equations governing susceptible-infectious-recovered disease spreading through an RSN, and we demonstrate the existence of traveling wave solutions. If the distance kernel governing edge placement decays more slowly than exponentially, the population-scale dynamics are dominated by long-range hops followed by local spread of traveling waves. This provides a theoretical modeling framework for recent observations of how epidemics like Ebola evolve in modern connected societies, with long-range connections seeding new focal points from which the epidemic locally spreads in a wavelike manner.

  • 4 authors
·
Feb 4, 2017
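
The RSN abstract above describes nodes placed randomly in space and joined with a probability that depends on distance and on each node's expected degree. The sketch below shows one way such a network could be sampled. It is an illustration under stated assumptions, not the paper's construction: the unit-square domain, the product-of-weights edge probability, its normalization, and both example kernels are hypothetical choices.

```python
import math
import random


def sample_rsn(n, expected_degrees, kernel, seed=0):
    """Sample a spatial random graph on the unit square.

    Assumed edge probability for nodes i and j at distance d:
        p_ij = min(1, w_i * w_j / sum(w) * kernel(d))
    where w_i is node i's expected-degree weight. This normalization is an
    illustrative choice and does not guarantee the exact expected degrees.
    """
    rng = random.Random(seed)
    positions = [(rng.random(), rng.random()) for _ in range(n)]
    total_weight = sum(expected_degrees)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(positions[i], positions[j])
            weight = expected_degrees[i] * expected_degrees[j] / total_weight
            if rng.random() < min(1.0, weight * kernel(d)):
                edges.append((i, j))
    return positions, edges


def exponential_kernel(d, scale=0.1):
    """Short-range kernel: edge probability decays exponentially in distance."""
    return math.exp(-d / scale)


def power_law_kernel(d, scale=0.1, alpha=2.0):
    """Heavy-tailed kernel: decays more slowly than any exponential."""
    return (1.0 + d / scale) ** (-alpha)
```

Calling `sample_rsn(200, [5.0] * 200, power_law_kernel)` versus the same call with `exponential_kernel` gives a quick way to compare how many long-range edges each kernel produces, which is the distinction the abstract ties to hop-and-spread dynamics.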