Accurately Tracking Smartphones Indoors
By Ramsey Faragher and Robert Harle
If we wish to obtain consistently usable positions indoors using a mobile phone, we can augment its GPS or GNSS receiver with other unfettered sensing technologies such as gyroscopes and accelerometers supplemented by radio signals of opportunity. But is all of this actually feasible? The authors have conducted tests of a multi-system approach to positioning indoors with favorable results.
IS GPS REALLY A GLOBAL POSITIONING SYSTEM? Well, that depends on your definition of “global.” If it means that GPS operates well all over the world in environments where it was designed to work, then, yes, it is a global system. But, if you define global as meaning that GPS operates well everywhere not only outdoors with a clear view of the sky but also indoors and in other restricted environments, then (as some have argued), GPS is not truly global.
So why doesn’t GPS work (for the most part) indoors? Our mobile phones do, and they use similar bits of the electromagnetic spectrum. The basic problem is that the signals from GPS (and other GNSS) satellites are just too weak to easily penetrate buildings. They are more than strong enough to yield excellent positioning, navigation, and timing (or PNT) results if the antenna connected to the receiver can “see” the satellites unobstructed. But even outdoors, trees, buildings, and mountains can block the signals from one or more satellites at a time. And indoors, the signals are usually attenuated by walls, floors, and ceilings so much that a conventional receiver cannot lock onto them.
Receiver manufacturers have developed more sensitive receivers that can operate indoors, at least to some degree, provided they are paired with a good antenna. And receiver chips or modules with this more sensitive technology are often found in modern mobile phones. But they don’t typically provide reliable indoor positioning because they are being used with inexpensive, suboptimal antennas. Some potential improvement in indoor positioning capability is possible by supplying the receiver with satellite orbit and clock information through the mobile network rather than having the receiver acquire this information directly from the satellite signals. This assisted-GNSS technique allows a receiver to work with weaker signals. But it is not a panacea. Gaps or holes still exist for positioning indoors or in other obstructed environments, prompting one industry wag to liken GNSS coverage to Swiss cheese.
So, what are we to do if we wish to obtain consistently usable positions indoors using a mobile phone? As we will see in this month’s column, we can augment or bypass its GPS or GNSS receiver with other unfettered sensing technologies such as gyroscopes and accelerometers. These devices can be made very small using microelectromechanical technology and are already included in some mobile phones.
However, there are some issues with these devices for positioning, not the least of which is rapid position drift. We can restrain the drift by using magnetometers, for example – also present in some mobile phones. We can also use radio signals of opportunity to help in the positioning – signals available in the phone such as multi-generation mobile signals, Bluetooth, and Wi-Fi through their signal strength “fingerprints.” But is all of this actually feasible?
The authors of the article in this month’s column have conducted tests of such a multi-system approach to positioning indoors with quite favorable results. Are we at the stage of accurate positioning (and tracking) everywhere? Not quite, but we are getting closer.
“Innovation” is a regular feature that discusses advances in GPS technology and its applications as well as the fundamentals of GPS positioning. The column is coordinated by Richard Langley of the Department of Geodesy and Geomatics Engineering, University of New Brunswick. He welcomes comments and topic ideas.
In recent years, there has been increasing interest in ubiquitous positioning — accurate location fixes in any environment, outdoors and indoors. We have all become used to the availability and performance of global navigation satellite systems (GNSS) for accurate outdoor radio positioning with a reasonable degree of reliability and availability. However, indoor radio positioning is more challenging because GNSS signals do not penetrate buildings well, and we must instead rely on local infrastructure and other available inputs to aid the user.
Indoor radio positioning is, however, available to the general public today through the use of signal strength fingerprint databases managed and provided by third-party providers such as Skyhook. These typically use Wi-Fi and cellular signals because of their ubiquity and the prevalence of appropriate receiver circuits in consumer devices. The user can also access the fingerprint database through these media. These systems, therefore, have two clear constraints: the database must have been previously built via some form of survey process, and the user must have a data connection available to obtain it. A more scalable system would not rely on such constraints, and would instead develop its own database during operation.
The benefits of such a system are significant: it can provide location-based services, situational awareness, and asset tracking in new and unknown environments for consumers, emergency services, the military, lone workers, security personnel, and autonomous vehicles. There is no requirement for a data link to function, nor any prior surveying of the radio environment, nor any other prior knowledge such as a floor plan or map. However, the system can also be used to quickly and easily generate maps of the radio environment or floor plans, which can be beneficial for organizations wishing to provide positioning services to the public using a simpler positioning method; that is, this method can be used to rapidly survey an area and generate a signal fingerprint database for other users to exploit. Best of all, all of this can be achieved today in real time using an app for a consumer smartphone.
The Digital Swiss Army Knife
The last couple of decades have seen steady improvements in a variety of sectors that have led to new and flexible navigation capabilities — and all of these improvements can now be found in the little chunks of silicon, plastic, and glass in our pockets and handbags. Moore’s Law and the miniaturization of electronics have enabled us all to carry handheld programmable supercomputers around with us every day. Microelectromechanical systems technologies and the demand for better gaming and augmented reality experiences on our smartphones mean that any new phone contains the same types of sensors for enhancing user experiences that cruise missiles and smartbombs use to ensure they hit their targets precisely.
Finally, your smartphone contains more radios than you probably realize. GPS (or GNSS); 2G, 3G, and 4G network radios; near field communications, like RFID; Bluetooth; Wi-Fi; and even a VHF FM chip might be tucked away in there somewhere. The near future is likely to bring a “whitespace” radio (using re-assigned vacated spectrum) along with a 60-GHz wireless USB transceiver. We are bathed in a phenomenal number of radio signals as we go about our daily lives, completely oblivious to the rich tapestry we are walking through — an invisible, permanent, detailed map just waiting to be sensed by our smartphones and annotated for our navigation purposes.
So, just what is possible with a commodity smartphone and its arsenal of features?
Pedestrian Motion Modeling
We can begin with the accelerometers, magnetometers, gyroscopes and barometers found in recent smartphones. These sensors collectively form an inertial measurement unit (IMU) that can be used to track the motion of a user through any environment, regardless of the availability of GNSS (at least in theory).
Unfortunately, there are many stumbling blocks in the way for any new navigator starting down this road. The standard approach for inertial navigation involves using the gyroscopes to maintain an estimate of the orientation of the device relative to the Earth, and to integrate the accelerometer measurements to calculate the system velocity and subsequently the change in position with each measurement update. A key aspect of this process is the removal of the effect of gravity, which requires us to estimate the value of the local gravity field strength (which varies with location across the globe) and its direction (which we do based on the estimated orientation of the device according to the gyroscopes). There are inevitably some errors associated with the estimates of both of these quantities.
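The strapdown mechanization just described can be sketched in a few lines. This is a minimal illustration rather than production code: the function name and the z-up Earth frame are our own choices, and a real system would also propagate the orientation from the gyroscopes rather than receive it ready-made.

```python
import numpy as np

GRAVITY = 9.81  # local gravity magnitude in m/s^2; in reality this varies with location


def strapdown_update(pos, vel, accel_body, R_body_to_earth, dt):
    """One strapdown INS step: rotate the body-frame specific force into a
    z-up Earth frame, remove gravity, then integrate twice. Any error in the
    gyro-derived orientation R leaks a component of gravity into the velocity
    estimate, which is one reason raw integration drifts so quickly."""
    specific_force = R_body_to_earth @ accel_body
    accel_earth = specific_force + np.array([0.0, 0.0, -GRAVITY])  # remove gravity
    vel = vel + accel_earth * dt
    pos = pos + vel * dt
    return pos, vel
```

For a stationary, level device the accelerometer reads roughly +9.81 m/s² along the body z-axis, the gravity term cancels it exactly, and the position estimate stays put; any mismatch between the assumed and true gravity vector integrates into position error quadratically with time.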
In addition, the sensors themselves suffer from noise, biases, instabilities, non-linearities, and other effects that degrade the system performance further. These errors accumulate over time because the position and orientation estimates at any moment depend on the cumulative sum of all measurements since the start of the journey. The result is rapid and unbounded growth in position and orientation error. The cost of the sensors is, of course, tightly correlated with their quality, and hence with the rate at which the navigation performance degrades. The quality of the sensors in smartphones is so low that this approach is rendered useless within the first few seconds of use. To make progress we must apply regular position corrections to the system by applying external constraints or incorporating external sensor measurements.
Alternative. GNSS measurements provide constraints and corrections for inertial navigation systems, but here we are considering operating indoors where these are unavailable or severely degraded. An alternative solution for most smartphone users is to use the inertial sensors in a different manner, within a so-called pedestrian dead-reckoning (PDR) approach. Here, it is assumed that the device being tracked is held by (or attached to) someone walking in a manner that can be modeled. The inertial sensors are no longer used to reproduce the full 3D motion of the device at the update rate of the sensors, but instead used simply to detect stepping motions and to infer that the user has moved some number of steps. Looking for patterns in the accelerometer data where minimum and maximum thresholds are exceeded within a certain time window is a surprisingly robust step counter when the user walks “normally” (more complicated actions such as side steps and stumbles require more complex algorithms). The smartphone can estimate its orientation by fusing together its gyroscope (which offers good short-term orientation tracking) and its magnetic compass (good long-term orientation tracking with periodic fluctuations from local magnetic anomalies). The step length of the user (a surprisingly consistent quantity) and any bias in the gyro-smoothed compass heading can both be measured and modeled during periods of GNSS availability, such that the best possible estimates are available when GNSS is lost.
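The threshold-window step detector described above can be sketched as follows. The threshold values, sampling rate, and debounce interval are illustrative assumptions; a deployed counter would tune them per user and device.

```python
import numpy as np


def count_steps(accel_mag, fs=100.0, hi=11.0, lo=8.5, min_interval=0.3):
    """Threshold-crossing step counter on the accelerometer magnitude signal
    (m/s^2). A step is declared when the signal rises above `hi` (the push-off
    peak) and then falls below `lo` (the swing trough); `min_interval` (in
    seconds) debounces implausibly fast step sequences."""
    steps = 0
    last_step_time = -min_interval
    armed = False  # set once the maximum threshold has been exceeded
    for i, a in enumerate(accel_mag):
        t = i / fs
        if a > hi:
            armed = True
        elif armed and a < lo and (t - last_step_time) >= min_interval:
            steps += 1
            last_step_time = t
            armed = False
    return steps
```

On a synthetic trace with five clear peak-trough cycles spaced at a walking cadence, the counter reports five steps; real data needs the band-pass filtering and adaptive thresholds that this sketch omits.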
FIGURE 1 shows the functional flow diagrams for a strapdown inertial navigation system (top) and a PDR system (bottom). Note that the PDR scheme accumulates error more slowly than the INS scheme (it involves fewer integrations over lower-rate data) but is heavily dependent on the performance of the gait-recognition, floor-change-detection, and step-length-estimation algorithms.
However, PDR techniques still accumulate error, resulting in gradual position drift, but with much higher performance than would be achieved by integrating the raw data in the traditional INS manner. Typical PDR schemes can track the user with an accuracy of a few percent of the distance walked, although this performance degrades with any un-modeled motions that confuse the step detector, such as infrequent backward or sidesteps. So how do we deal with this issue?
The accuracy of PDR schemes is dependent on the validity of the pedestrian motion model. Any un-modeled action has the potential to generate false positive events in the step detector and hence contribute to position error. Users may stoop, crawl, jump, hop, or shake their device while static — motions that are all very difficult to unambiguously discriminate in raw sensor data.
There are many approaches to solving this problem of gait recognition, and most exploit machine learning techniques. The basic principle of supervised machine learning is that a large set of labeled training data (that is, lots of manually categorized data of each type) is analyzed by a computer in order to extract patterns, statistics, or certain measurement sequences from the inertial sensor measurements that reveal the type of step that was taken. In unsupervised learning, the clusters and categories within the data must be found by the algorithms themselves.
The outputs from such algorithms are typically thresholds, signatures, and other learned metrics that can be installed in a smartphone and used to dynamically classify movements. It is also possible to deploy the learning algorithms on the device itself so that it can learn what the particular user’s signatures are to permit better step and gait detection (like training a speech-recognition program to understand your accent). A simple example of this is running an error-state Kalman filter while GNSS signals are available to determine the user step length and to detect any background compass bias that is corrupting the system.
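As a simple illustration of that last idea, the sketch below reduces the error-state filter to a single scalar Kalman filter for step length, fed by GNSS-derived distances while satellite fixes are still available. The class name and noise values are our own assumptions, not the authors' implementation, and a real error-state filter would jointly estimate heading bias as well.

```python
class StepLengthFilter:
    """Scalar Kalman filter for step length. While GNSS is available, each
    measurement is the GNSS-derived distance walked divided by the number of
    steps counted over that interval; the filtered estimate is then used for
    dead reckoning once GNSS is lost."""

    def __init__(self, step_len=0.7, var=0.04):
        self.step_len = step_len  # prior step length (m)
        self.var = var            # prior variance (m^2)

    def update(self, gnss_distance, n_steps, meas_var=0.01):
        z = gnss_distance / n_steps            # measured step length (m)
        k = self.var / (self.var + meas_var)   # Kalman gain
        self.step_len += k * (z - self.step_len)
        self.var *= (1.0 - k)                  # posterior variance shrinks
        return self.step_len
```

Starting from a 0.70-meter prior and feeding it intervals of 15 meters covered in 20 steps, the estimate converges toward the measured 0.75 meters within a handful of updates, with the variance shrinking at each step.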
A problem yet to be resolved for PDR schemes is a basic physical one: the laws of physics are the same for an object at rest as for one moving at constant speed. This means that it is theoretically possible for a suitably skilled person to simulate the “already moving at constant velocity” version of any of these motions while static by moving the device in just the right manner, effectively spoofing as many steps or motions as they like. The opening and closing phases of a journey (that is, the very first and last steps) are critical in distinguishing real and spoofed motion if only inertial sensing is used to disambiguate real and spoofed motion through an environment. We will, however, return to this problem in a moment.
Simultaneous Localization and Mapping
The application of machine learning can be extended to the entire indoor navigation problem using a technique called Simultaneous Localization and Mapping (SLAM). A key aspect here is the hypothesis that there are some measurements that can be taken within an indoor environment that vary rapidly on the spatial scale but only slowly on a temporal scale. These opportunistic measurements are typically of radio signal strength (Wi-Fi, cellular, television, VHF FM, and so on) and magnetic field strength, although in principle many other metrics could be used such as light level and temperature. They are deemed to be opportunistic because they already exist in the environment and have not been generated specifically for this positioning system. Moving along a corridor is expected to result in a particular sequence of measurements that is repeatable on the next visit to that corridor with a confidence based on the time since the last visit. Tight agreement is expected within the next few minutes, close agreement within the next few days, and so on. It is not expected that these fingerprints will necessarily be valid for months or years, as objects may move around the environment; for example, large items may be relocated and Wi-Fi access points may be moved. The ability to exploit the expectation of high repeatability over short time periods of a few hours is the key to developing a system that can learn about its environment and improve its performance during use.
As the device moves through the indoor environment (with position estimate driven by the PDR estimation), the opportunistic fingerprints are captured and stored. If the device returns to a region it has been in before, then it will record a sequence of measurements that will agree closely with the previous sequence that was recorded in the past. This provides a constraint to the system: whatever path was taken in between, it has converged with a section of its historical path and “closed a loop.” Any offset in these two path sections at this point reveals the inertial error that has accumulated during this loop. The system can therefore correct its own inertial error growth, allowing extended operations in GNSS-denied areas.
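One very simple way to apply such a loop-closure constraint is to distribute the observed offset linearly back along the path: the pose at the closure point is moved by the full offset, and earlier poses by a proportionally smaller share. The function below is an illustrative sketch under that assumption; it is a crude stand-in for the full batch optimization, not the authors' method.

```python
import numpy as np


def correct_loop(path, closure_offset):
    """Distribute a loop-closure correction linearly along the path.

    path           -- sequence of (x, y) position estimates for the loop
    closure_offset -- (dx, dy) by which the final estimate must move to
                      coincide with the matching historical position
    Returns the corrected path: the start is untouched, the end is moved by
    the full offset, and intermediate poses are interpolated in between."""
    path = np.asarray(path, dtype=float)
    weights = np.linspace(0.0, 1.0, len(path))[:, None]
    return path + weights * np.asarray(closure_offset, dtype=float)
```

Applied to a five-pose path that has drifted one meter sideways by the time the loop closes, the start point is left alone, the midpoint moves half a meter, and the endpoint snaps onto the revisited location.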
Fingerprint Maps. The gathered opportunistic measurements can also be used to generate fingerprint maps of the areas, which can be shared with other users to allow them to accurately position themselves within those areas in the future, reducing everyone’s reliance on PDR schemes and removing the need for environments to be manually surveyed for their environmental maps. The maps are automatically calibrated and corrected by the SLAM process. As more users operate in the environment and more data accumulate, it becomes easier to identify and remove erroneous data that do not fit into the consensus being formed by the “intelligence of crowds.” This opportunistic navigation scheme can also feed back into the PDR scheme to aid with motion detection — because fingerprints are expected to vary on a fine spatial scale as users move through an environment, they can be used to detect when a PDR device is in reality static but being moved in a manner that is erroneously triggering the step-detection routine.
FIGURE 2 shows a plot of the magnetic-field-strength variations recorded during four walks down the same corridor of a building at four different times of day on four different days. The traces have been manually aligned by the clear drop in field strength at step number 40. A fixed step length was assumed, and the relative stretching evident across the traces is due to small differences in walking speeds across the tests. Step-length changes can be estimated using changes in the stepping frequency, and the typical step length can be observed and calibrated during periods of GNSS availability.
There are two distinct classes of SLAM algorithm for PDR. The most common class involves an iterative batch process applied after the data have been collected (that is, offline). This process (which might be least-squares fitting or maximum-likelihood estimation, for example) identifies loop-closure points and provides an optimal joint estimate of the path taken by the user that satisfies these constraints and the raw odometry data as much as possible. The Wi-Fi SLAM approaches, Gaussian Process Latent Variable Models and GraphSLAM, both use such schemes. The results are typically robust, but the offline processing stage can be lengthy.
SLAM can, however, be performed in real time, even on a smartphone, by exploiting an efficient multi-hypothesis scheme. As the user moves, we retain multiple hypotheses for their position and, crucially, record the history of each hypothesis. This is typically done using a particle filter, where each particle represents a unique hypothesis. In this context, we must store the tree of ancestors for each particle at each epoch. When we detect a loop closure, we prune the history to remove all hypotheses that did not result in a loop closure at that point and therefore dynamically correct our errors. Note that each particle can even be assigned different parameter values, such as step length or heading bias, and if a gait detection scheme cannot confidently identify the type of step taken, new particles representing every possible user motion at that epoch can be generated.
Occupancy Grid. Rather than running a specific loop closure algorithm, an occupancy grid is used, whereby the environment is defined by a grid of small cells, for example, one meter by one meter squares. As each particle propagates, representing a hypothesis of the user path, it posts its identity and the current step number into the occupancy grid. As the user continues to move, the particles check the grid cells they move through for any previous visits. If a particle has visited a cell before, the current sensor measurements are compared to those recorded at the time of the last visit. If there is close agreement (typically scored using metrics such as the Euclidean or Mahalanobis distances) then that particular particle is given a high weight. Conversely, poor agreement results in a low weighting.
The entire particle cloud can be reweighted accordingly with low-scoring particles being killed and high-scoring particles being duplicated. The result is the particle cloud collapsing towards the region of close agreement between old and new sensor measurements. Because the occupancy grid contains the historical path of each particle stored via their IDs and step-number sequence, when a reweighting of particles occurs, the historical path of the user is updated and improved accordingly along with the current estimate of the user’s location.
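The cell-lookup, fingerprint-comparison, and resampling steps can be sketched as below. The Gaussian scoring of the Euclidean distance, the one-meter cell size, and all names are illustrative assumptions; a full implementation would also carry the particle ancestry and stored paths described above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for a reproducible sketch


def reweight_and_resample(particles, weights, grid, fingerprint, sigma=2.0):
    """Score each particle by comparing the current fingerprint vector with
    the one stored in the occupancy-grid cell the particle occupies, using a
    Gaussian of the Euclidean distance, then resample the cloud.

    particles   -- list of (x, y) position hypotheses in meters
    grid        -- dict mapping 1 m x 1 m cell indices to stored fingerprints
    fingerprint -- current measurement vector (e.g. Wi-Fi signal strengths)"""
    w = np.array(weights, dtype=float)
    for i, (x, y) in enumerate(particles):
        cell = (int(x), int(y))
        if cell in grid:  # revisited cell: compare old and new fingerprints
            d = np.linalg.norm(fingerprint - grid[cell])
            w[i] *= np.exp(-0.5 * (d / sigma) ** 2)
    w /= w.sum()
    # Resample: low-weight hypotheses die, high-weight ones are duplicated.
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return [particles[i] for i in idx]
```

With two hypotheses, one sitting in a cell whose stored fingerprint matches the current measurement and one in a cell that disagrees badly, resampling collapses the cloud onto the matching hypothesis, which is exactly the loop-closure behavior described above.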
The SLAM estimate can be improved by many types of observations, not just loop closures. If the user moves outside and confident GNSS locations become available, these can also be used to reweight the particle cloud. If the user moves into a region where the floor plan of the building is available to the positioning engine, particles can be pruned whenever they try to cross walls. If desired, even direct user interaction such as manually tapping the map on the smartphone display could be used to provide a position estimate and so constrain the particle cloud.
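Wall-based pruning reduces to a 2D segment-intersection test between a particle's latest step and each wall segment of the floor plan. The sketch below uses the standard orientation (counterclockwise) test and is an illustration rather than the authors' code; it ignores degenerate collinear cases, which a robust implementation would handle.

```python
def crosses_wall(p0, p1, wall):
    """Return True if the step from p0 to p1 intersects the wall segment
    (w0, w1); a particle making such a step would be pruned. Uses the
    orientation test: the segments intersect when each straddles the line
    through the other."""
    def ccw(a, b, c):
        # True if a, b, c are in counterclockwise order
        return (c[1] - a[1]) * (b[0] - a[0]) > (b[1] - a[1]) * (c[0] - a[0])

    w0, w1 = wall
    return (ccw(p0, w0, w1) != ccw(p1, w0, w1)
            and ccw(p0, p1, w0) != ccw(p0, p1, w1))
```

A step that passes through a wall between two rooms returns True and the particle is killed; a step that stops short of the wall returns False and the hypothesis survives.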
FIGURE 3 shows six stages from a walk around the corridors of a building using an indoor positioning smartphone app to track the user. The red dashed line shows the trace using just the PDR scheme, which exhibits gradual degradation in positioning accuracy. The green solid line shows the trace using SLAM to constrain the PDR error growth using magnetic anomalies and Wi-Fi signal strengths.
Computer Vision

A further modern advance is in computer vision: the use of cameras and algorithms to monitor and interpret features in the environment. The movement of features within the field of view from frame to frame can be used to determine the motion of the camera if it is assumed that the majority of the objects tracked through the view are actually static in the environment. Consistency checks between features allow those corresponding to other moving objects to be filtered out.
The result of this visual odometry scheme is the ability to determine the speed and heading changes of the camera by observing the optical flow of the environment. As with PDR approaches, integrating over visual odometry measurements results in motion tracking whose accuracy degrades much more slowly over time and distance than for systems built upon traditional IMU integration (accelerometers and gyroscopes) alone. If specific objects or features can be uniquely identified and recognized when seen again in the future, then SLAM techniques can also be applied. At the moment, smartphones are powerful enough to apply computer vision techniques and calculations at moderate update rates of a few frames per second. As smartphones become more powerful, as mobile operating systems begin to permit these computer vision algorithms to run on dedicated graphics processing units, or if devices such as Google Glass result in dedicated computer vision chips being deployed within devices, we will see computer vision coupled with augmented reality move to the forefront of smartphone navigation.
Our desire for accurate positioning and tracking anywhere will never go away. The availability of cheap, accurate GPS over the last decade has resulted in accurate positioning, navigation, and timing not only being something we take for granted, but something society has come to depend upon. The positioning capabilities of our smartphones will continue to improve, not only because of the new developments and capabilities described above, but because of new infrastructure developments.
The In-Location Alliance is a large consortium of companies, including big names like Nokia and CSR, that is defining standards for Bluetooth and other beacon-based positioning technologies for dedicated deployments in indoor environments such as shopping centers, airports, and museums. The new 4G LTE signal structure also contains a dedicated ranging signal to permit traditional timing-based positioning schemes to be easily deployed using these new cellular standards. All infrastructure-based schemes incur costs associated with deployment and maintenance that ultimately limit their scope of deployment; opportunistic schemes are the key to truly ubiquitous positioning.
While billions of dollars are being spent worldwide on deploying and maintaining new GNSS, there will always be scenarios and environments where these weak signals are blocked or severely corrupted. In these cases, opportunistic sensing powered by smart algorithms running on consumer devices costing a few hundred dollars will be there to fill those gaps.
Ramsey Faragher is a senior research associate at the University of Cambridge and an associate editor for the journal of the Royal Institute of Navigation. Previously he was a principal scientist at the BAE Systems Advanced Technology Centre, near Chelmsford in the United Kingdom, where he developed the NAVSOP GNSS-denied positioning system. His research interests include opportunistic positioning, sensor fusion, and machine learning.
Robert Harle is a senior lecturer at the University of Cambridge with research interests in positioning, sensor fusion, and wireless sensor networks. He has worked on indoor positioning since 2000, developing a series of infrastructure-based and infrastructure-free solutions.
Further Reading
• Simultaneous Localization and Mapping
“SmartSLAM – An Efficient Smartphone Indoor Positioning System Exploiting Machine Learning and Opportunistic Sensing” by R.M. Faragher and R.K. Harle in Proceedings of ION GNSS+ 2013, the 26th International Technical Meeting of the Satellite Division of The Institute of Navigation, Nashville, Tennessee, September 16–20, 2013 (in press).
“Opportunistic Radio SLAM for Indoor Navigation Using Smartphone Sensors” by R. Faragher, C. Sarno, and M. Newman in Proceedings of PLANS 2012, Institute of Electrical and Electronics Engineers / Institute of Navigation Position, Location and Navigation Symposium, Myrtle Beach, South Carolina, April 23–26, 2012, pp. 120–128.
“Efficient, Generalized Indoor WiFi GraphSLAM” by J. Huang, D. Millman, M. Quigley, D. Stavens, S. Thrun, and A. Aggarwal in Proceedings of 2011 IEEE International Conference on Robotics and Automation, Shanghai, May 9–13, 2011, pp. 1038–1043, doi: 10.1109/ICRA.2011.5979643.
“WiFi-SLAM Using Gaussian Process Latent Variable Models” by B. Ferris, D. Fox, and N. Lawrence in Proceedings of IJCAI-07, the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India, January 6–12, 2007, R. Sangal, H. Mehta, and R. K. Bagga (Eds.), published by Morgan Kaufmann Publishers Inc., San Francisco, California, pp. 2480–2485.
“Simultaneous Map Building and Localization for an Autonomous Mobile Robot” by J.J. Leonard and H.F. Durrant-Whyte in Proceedings of IROS’91, Institute of Electrical and Electronics Engineers / Robotics Society of Japan International Workshop on Intelligence for Mechanical Systems, Osaka, Japan, November 3–5, 1991, pp. 1442–1447, doi: 10.1109/IROS.1991.174711.
• Integrated Indoor Navigation
“A Survey of Indoor Inertial Positioning Systems for Pedestrians” by R. Harle in IEEE Communications Surveys & Tutorials, Vol. 15, No. 3, pp. 1281–1293, doi: 10.1109/SURV.2012.121912.00075.
Principles of GNSS, Inertial, and Multisensor Integrated Navigation Systems, Second Edition, by P.D. Groves, published by Artech House, Boston, Massachusetts, 2013.
• Wi-Fi Positioning
“Wi-Fi Azimuth and Position Tracking Using Directional Received Signal Strength Measurements” by J. Seitz, T. Vaupel, S. Haimerl, J.G. Boronat, and J. Thielecke in Proceedings of 2012 Workshop on Sensor Data Fusion: Trends, Solutions, Applications, Bonn, September 4–6, 2012, pp. 72–77, doi: 10.1109/SDF.2012.6327911.
“Comparison of WiFi Positioning on Two Mobile Devices” by P.A. Zandbergen in Journal of Location Based Services, Vol. 6, No. 1, 2012, pp. 35–50, doi: 10.1080/17489725.2011.630038.
• Step Length and Pedestrian Navigation
“Step Length Estimation Using Handheld Inertial Sensors” by V. Renaudin, M. Susi, and G. Lachapelle in Sensors, Vol. 12, No. 7, 2012, pp. 8507–8525, doi: 10.3390/s120708507.
• Computer Vision and Navigation
“Improving the Accuracy of EKF-Based Visual-Inertial Odometry” by M. Li and A.I. Mourikis in Proceedings of 2012 IEEE International Conference on Robotics and Automation, Saint Paul, Minnesota, May 14–18, 2012, pp. 828–835, doi: 10.1109/ICRA.2012.6225229.
• Machine Learning
Information Theory, Inference and Learning Algorithms by D.J.C. MacKay, published by Cambridge University Press, Cambridge, U.K., 2003.
• Mobile Phone GPS Antenna Performance
“Mobile-Phone GPS Antennas: Can They Be Better?” by T. Haddrell, M. Phocas, and N. Ricquier in GPS World, Vol. 21, No. 2, February 2010, pp. 29–35.