Global Biodiversity Information Facility (GBIF) - Fungi | Biodiversity dataset | Mycology dataset

www.gbif.org, indexed 2024-10-25

Biodiversity

Mycology

Download link:

https://www.gbif.org/

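Overview:

This dataset contains fungal species records from around the world, covering distribution, taxonomy, and ecological information. The data come from biodiversity research institutions and natural history museums worldwide and are integrated and shared through the GBIF platform.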

Provider:

www.gbif.org

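AI-compiled summary

Dataset introduction

Construction

The Global Biodiversity Information Facility (GBIF) - Fungi dataset is built on a global biodiversity information network. It integrates specimen records, literature citations, and field survey data from around the world into a comprehensive, continuously updated database of fungal species. Its construction follows international biodiversity information standards to keep the data accurate and consistent.

Features

The GBIF - Fungi dataset is notable for its broad geographic coverage and rich species diversity. It contains millions of fungal species records spanning environments from tropical rainforests to polar permafrost. It also provides detailed taxonomic information, ecological niche data, and climate-change-related trend analyses, making it a valuable resource for research on global fungal diversity.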

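Usage

The GBIF - Fungi dataset can be used in many ways and is relevant to ecology, biodiversity conservation, climate change research, and other fields. Researchers can access and download data directly through GBIF's online platform for analyses such as species distribution modeling and ecosystem service assessment. The data are also available through an API, which lets researchers automate data extraction and integration and supports interdisciplinary work.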
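As a rough illustration of the API access described above, the sketch below builds a query against GBIF's public occurrence search endpoint and pages through the results. It assumes that the endpoint is `https://api.gbif.org/v1/occurrence/search`, that kingdom key 5 denotes Fungi in the GBIF backbone taxonomy, and that each response page carries `results` and `endOfRecords` fields. The helper names `build_occurrence_url` and `iter_fungi_records` are hypothetical and not part of any GBIF client library.

```python
import json
import urllib.parse
import urllib.request
from typing import Iterator, Optional

# Assumed base URL of GBIF's public occurrence search API.
GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"
# Assumed backbone-taxonomy key for kingdom Fungi; verify before relying on it.
FUNGI_KINGDOM_KEY = 5


def build_occurrence_url(country: Optional[str] = None,
                         limit: int = 100,
                         offset: int = 0) -> str:
    """Build a search URL for fungal occurrence records (hypothetical helper)."""
    params = {"kingdomKey": FUNGI_KINGDOM_KEY, "limit": limit, "offset": offset}
    if country:
        params["country"] = country  # ISO 3166-1 alpha-2 code, e.g. "SE"
    return GBIF_OCCURRENCE_SEARCH + "?" + urllib.parse.urlencode(params)


def iter_fungi_records(country: Optional[str] = None,
                       page_size: int = 100,
                       max_records: int = 300) -> Iterator[dict]:
    """Yield occurrence records page by page until the cap or the last page."""
    offset = 0
    yielded = 0
    while yielded < max_records:
        url = build_occurrence_url(country, page_size, offset)
        with urllib.request.urlopen(url, timeout=30) as resp:
            page = json.load(resp)
        for record in page.get("results", []):
            yield record
            yielded += 1
            if yielded >= max_records:
                return
        # Stop when the server reports the last page (or omits the flag).
        if page.get("endOfRecords", True):
            return
        offset += page_size


if __name__ == "__main__":
    # Requires network access; prints a few records reported from Sweden.
    for rec in iter_fungi_records(country="SE", max_records=5):
        print(rec.get("scientificName"), rec.get("countryCode"))
```

Paging through the search endpoint suits small samples and exploration. For bulk extraction, GBIF's asynchronous download service is likely the better route.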

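Background and challenges

Background overview

The Global Biodiversity Information Facility (GBIF) - Fungi dataset is a global database of fungal species information maintained by GBIF. Work on the dataset began in 2001, with research institutions and natural history museums around the world contributing, with the aim of collecting, integrating, and sharing fungal species records worldwide. As biodiversity research has advanced, data on fungi, a key component of ecosystems, have become important for understanding ecological balance, species evolution, and environmental change. The GBIF - Fungi dataset has substantially supported taxonomic research, ecological analysis, and the development of conservation strategies for fungi worldwide.

Current challenges

Although the GBIF - Fungi dataset plays an important role in mycological research, building it still involves several challenges. First, the diversity and wide distribution of fungal species make data collection unusually complex, and data from many remote regions are hard to obtain. Second, fungal identification relies on both morphological and molecular techniques, and the way these techniques are applied and the resulting data are standardized varies considerably, so data quality is uneven. Third, updating and maintaining the dataset requires sustained funding and technical support to keep pace with shifting species distributions and new scientific findings. These challenges limit the completeness and timeliness of the dataset and reduce its effectiveness in research and practical applications.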

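Development history

Creation and updates

The Global Biodiversity Information Facility (GBIF) - Fungi dataset was started in 2001 by GBIF to collect and share data on global fungal diversity. It has been updated continuously since then, with the most recent data reaching 2023.

Key milestones

Key milestones for the GBIF - Fungi dataset include the first release of a global fungal species distribution map in 2007, which gave a strong boost to fungal ecology and biodiversity research. In 2012, automated data processing and quality control were introduced, markedly improving the reliability and usability of the data. In 2018, GBIF worked with several international organizations to integrate fungal data from around the world, making the dataset an important resource for global fungal research.

Current status

Today the GBIF - Fungi dataset is a core resource for mycological research worldwide, with more than 1.5 million fungal records from over 200 countries and regions. It supports basic research, such as studies of species distribution and ecosystem function, and provides key data for environmental protection, biodiversity monitoring, and policy making. As technology advances, the dataset is gradually gaining visualization and interactive analysis capabilities, further increasing its value for research and practical applications.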

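Timeline

2001: The Global Biodiversity Information Facility (GBIF) is formally established to promote the sharing and use of global biodiversity data.
2007: GBIF releases its first dataset on fungi (Fungi), marking the start of systematic integration and publication of fungal biodiversity data.
2012: GBIF's fungal dataset grows substantially, covering fungal species records from many regions of the world and providing important data for research and conservation.
2018: GBIF publishes a major update to the fungal dataset, adding a large number of new species records and geographic distribution data and further improving its completeness and usefulness.
2021: GBIF's fungal dataset is widely used in global biodiversity assessments, ecosystem research, and environmental protection projects, and is recognized internationally as an important fungal data resource.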

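Common use cases

Typical use cases

Within the Global Biodiversity Information Facility (GBIF), the Fungi dataset is widely used in ecology and biodiversity research. Researchers use it to analyze the distribution patterns of fungal species, their ecological niches, and their relationships with environmental factors. These analyses help reveal the roles fungi play in different ecosystems and how they respond to global climate change.

Practical applications

In practice, GBIF's fungal dataset is used in agriculture, forestry, environmental protection, and other fields. Agricultural scientists, for example, use the data to identify and exploit beneficial fungi that improve crop yield and disease resistance. Forest managers analyze fungal data to assess forest health and draw up sustainable forest management plans. Environmental agencies use the dataset to monitor and assess ecosystem health and to provide data for policy making.

Derived and related work

GBIF's fungal dataset has given rise to a large body of related research, including taxonomic studies of fungal species, ecological network analysis, and assessments of how global change affects fungal diversity. This work has deepened our understanding of fungal biology and encouraged interdisciplinary collaboration, for instance between ecology and climate science. The dataset has also advanced data-driven conservation strategies, offering new perspectives and methods for global biodiversity conservation.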
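A minimal sketch of the kind of distribution analysis described here, counting occurrence records per country and per latitude band. It assumes each record is a dictionary with the `countryCode` and `decimalLatitude` fields used in GBIF occurrence JSON. The helper names and the sample records are invented for illustration and are not real GBIF data.

```python
from collections import Counter
from typing import Iterable, Tuple


def latitude_band(lat: float, width: int = 10) -> str:
    """Label the latitude band of the given width (in degrees) containing lat."""
    lower = int(lat // width) * width
    return f"{lower}..{lower + width}"


def summarize_distribution(records: Iterable[dict],
                           width: int = 10) -> Tuple[Counter, Counter]:
    """Count records per country code and per latitude band.

    Records without a usable latitude are skipped for the band counts only.
    """
    by_country: Counter = Counter()
    by_band: Counter = Counter()
    for rec in records:
        by_country[rec.get("countryCode", "unknown")] += 1
        lat = rec.get("decimalLatitude")
        if isinstance(lat, (int, float)) and -90 <= lat <= 90:
            by_band[latitude_band(lat, width)] += 1
    return by_country, by_band


# Made-up sample records, for illustration only (not real GBIF data).
sample = [
    {"scientificName": "Amanita muscaria", "countryCode": "SE", "decimalLatitude": 59.3},
    {"scientificName": "Amanita muscaria", "countryCode": "SE", "decimalLatitude": 63.8},
    {"scientificName": "Boletus edulis", "countryCode": "IT", "decimalLatitude": 45.1},
    {"scientificName": "Boletus edulis", "countryCode": "IT"},  # no coordinates
]

countries, bands = summarize_distribution(sample)
print(countries.most_common())
print(sorted(bands.items()))
```

Counts like these are only a first look. Raw record counts reflect sampling effort as much as true distribution, so a real analysis would need to correct for collection bias.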

The content above was collected and summarized by AI.


