References¶
ZenSVI builds on tools and datasets developed by many researchers. The following references credit the original authors of those tools and datasets; please cite them when using ZenSVI in your research.
Main Classes and Functions¶
Download Module¶
Metadata Module¶
MLYMetadata
: Class for processing Mapillary metadata [3]
Computer Vision (CV) Module¶
Segmenter
: Class for semantic/panoptic segmentation [4]

ClassifierPlaces365
: Classifier based on the Places365 model [11]

ClassifierGlare
: Classifier for detecting glare in images [5]

ClassifierLighting
: Classifier for determining lighting conditions [5]

ClassifierPanorama
: Classifier for identifying panoramic images [5]

ClassifierPlatform
: Classifier for determining the capture platform [5]

ClassifierQuality
: Classifier for assessing image quality [5]

ClassifierReflection
: Classifier for detecting reflections in images [5]

ClassifierViewDirection
: Classifier for determining view direction [5]

ClassifierWeather
: Classifier for identifying weather conditions [5]

ClassifierPerception
: Classifier for perception-based image analysis [5] [6]

DepthEstimator
: Class for depth estimation in images [7] [8] [10]

Embeddings
: Class for generating image embeddings [9]
Bibliography¶
[1] KartaView Documentation. URL: https://doc.kartaview.org/ (visited on 2024-10-14).

[2] Amsterdam/panorama. Gemeente Amsterdam, June 2024. URL: https://github.com/Amsterdam/panorama.

[3] Mapillary/mapillary-python-sdk. September 2024. URL: https://github.com/mapillary/mapillary-python-sdk (visited on 2024-10-14).

[4] Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, and Rohit Girdhar. Masked-attention Mask Transformer for Universal Image Segmentation. June 2022. arXiv:2112.01527, doi:10.48550/arXiv.2112.01527.

[5] Yujun Hou, Matias Quintana, Maxim Khomiakov, Winston Yap, Jiani Ouyang, Koichi Ito, Zeyu Wang, Tianhong Zhao, and Filip Biljecki. Global Streetscapes — A comprehensive dataset of 10 million street-level images across 688 cities for urban science and analytics. ISPRS Journal of Photogrammetry and Remote Sensing, 215:216–238, September 2024. doi:10.1016/j.isprsjprs.2024.06.023.

[6] Xiucheng Liang, Jiat Hwee Chang, Song Gao, Tianhong Zhao, and Filip Biljecki. Evaluating human perception of building exteriors using street view imagery. Building and Environment, 263:111875, September 2024. doi:10.1016/j.buildenv.2024.111875.

[7] René Ranftl, Alexey Bochkovskiy, and Vladlen Koltun. Vision transformers for dense prediction. arXiv preprint, 2021. doi:10.48550/arXiv.2103.13413.

[8] René Ranftl, Katrin Lasinger, David Hafner, Konrad Schindler, and Vladlen Koltun. Towards robust monocular depth estimation: mixing datasets for zero-shot cross-dataset transfer. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020. doi:10.48550/arXiv.1907.01341.

[9] Christian Safka. christiansafka/img2vec. September 2024. URL: https://github.com/christiansafka/img2vec (visited on 2024-10-14).

[10] Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, and Hengshuang Zhao. Depth Anything: unleashing the power of large-scale unlabeled data. In CVPR. 2024. doi:10.48550/arXiv.2401.10891.

[11] Bolei Zhou, Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba. Places: a 10 million image database for scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017. doi:10.1109/TPAMI.2017.2723009.