Do you remember the scene in the movie, Bone Collector, wher the police officer places a dollar bill next to a footprint before she took pictue of it? That move did wonders for the police officer because that easily gave investigators clue as to the shoes size and shoe brand etc leading up to the suspect. The Dollar Bill served as a Visual Scale to the image.
That is how I feel also whenever I see a 3D rendered image of a building or a facade where I really couldn’t figure out how big is the thing unless there is some sort of Visual Scale like trees cars or people. Similar to the sets of rendered image below: