How Does Google Maps Get Street Views?

The Technology Behind Google Street Views

Google's Street View service uses a $100,000 Dodeca 2360 camera developed by an outside contractor, Immersive Media. This digital video camera has 11 lenses mounted on a geodesic globe that sits on top of Google's fleet of photo vehicles. The camera simultaneously captures high-resolution images in every direction to create a 360-degree view of its surroundings. Its built-in GPS processor records geographical coordinates and matches them to the video. The digital video and GPS files are then stored on hard drives for processing by Immersive Media software. A gyroscopic image stabilization system ensures that footage is properly oriented on hills and bumpy roads, and the camera's high frame rate prevents video pixelation due to vehicle motion.
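The pairing of GPS fixes with video frames described above can be sketched as timestamp interpolation: a GPS receiver reports fixes far less often than the camera records frames, so each frame's position is estimated between the two fixes that bracket it. This is an illustrative sketch only; the function and data layout are assumptions, not Immersive Media's actual pipeline.

```python
from bisect import bisect_left

def interpolate_fix(gps_fixes, frame_time):
    """Estimate (lat, lon) for a video frame by linearly interpolating
    between the two GPS fixes that bracket its timestamp.

    gps_fixes: list of (timestamp_s, lat, lon), sorted by timestamp.
    """
    times = [t for t, _, _ in gps_fixes]
    i = bisect_left(times, frame_time)
    if i == 0:                      # frame precedes the first fix
        return gps_fixes[0][1:]
    if i == len(gps_fixes):        # frame follows the last fix
        return gps_fixes[-1][1:]
    t0, lat0, lon0 = gps_fixes[i - 1]
    t1, lat1, lon1 = gps_fixes[i]
    w = (frame_time - t0) / (t1 - t0)
    return (lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0))

# A 1 Hz GPS track paired with a frame timestamp between two fixes:
fixes = [(0.0, 37.4219, -122.0841), (1.0, 37.4220, -122.0843)]
print(interpolate_fix(fixes, 0.5))  # position halfway between the fixes
```

Linear interpolation is a reasonable first-order model here because a vehicle moves nearly in a straight line over the fraction of a second between consecutive fixes.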

Taking the Show on the Road

Google's team of Street View technicians is a hardworking bunch. They collectively spend 46 hours a day driving through neighborhoods to capture the video and GPS data. Each Street View vehicle is manned by a two-person crew that uses Immersive Media's collection plan to create a data capture shot list for its area. While these teams are on the road, they give progress reports to a project manager every day. They also review and edit captured footage to verify the accuracy of their shot list before sending it to a post-production coordinator. Every city has major routes and landmarks, and these locations are Immersive Media's top collection targets.

Processing the Results

Generating images from the raw video captures and stitching them together to create Street View maps is a complex task. To convert motion video footage to usable still images, a vertical slice of the changing view directly in front of each camera must be regularly cropped from the moving image plane. When these vertical slices, taken from multiple perspectives, are stitched together in a linear sequence, a panoramic Street View is created. Google uses a pose optimizer library to reconstruct camera data using input from the camera's GPS, accelerometers, and rate-gyroscope sensors. These sensors' readings let the software convert the captured real-world coordinates to Cartesian map coordinates it can understand. This is how a 360-degree world view is processed into a rectangular format for Internet viewing.
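The two coordinate steps above can be sketched in a few lines: projecting latitude/longitude onto a flat local Cartesian plane, and mapping a 360-degree viewing direction onto a rectangular (equirectangular) panorama image. Both functions are illustrative simplifications under stated assumptions, not Google's actual pose-optimizer code; the equirectangular approximation below is only accurate over the few-kilometre span of a single drive.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, metres

def latlon_to_local_xy(lat, lon, lat0, lon0):
    """Project (lat, lon) onto a flat Cartesian plane centered at
    (lat0, lon0) using the equirectangular approximation.
    Returns (x_east_m, y_north_m)."""
    x = math.radians(lon - lon0) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y

def direction_to_pixel(yaw_deg, pitch_deg, width, height):
    """Map a viewing direction to a pixel in an equirectangular panorama:
    yaw (-180..180 degrees) spans the image width left to right,
    pitch (-90..90 degrees) spans the height top to bottom."""
    px = (yaw_deg + 180.0) / 360.0 * width
    py = (90.0 - pitch_deg) / 180.0 * height
    return px, py

# Straight ahead (yaw 0, pitch 0) lands at the center of the panorama:
print(direction_to_pixel(0, 0, 3600, 1800))
```

The equirectangular mapping is why the full 360-degree view fits a rectangle twice as wide as it is tall: 360 degrees of yaw against 180 degrees of pitch.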