Do Spreadsheets Let You Down? Converting Row Data to JSON Trees Is Easy
Published: 2019-05-11

Like many of you, I often have to take the results of a SQL query and convert the rowsets to JSON. Sometimes I have to do the same with CSV files from spreadsheets. The transformation is something anyone can do, but it can be a hassle: time-consuming and error-prone. This post will show you how to use the treeize Node.js package to simplify the process in very few lines of code.

Before going further, I'll first need a dataset to base some examples on. The domain will be Books, which lend themselves to all sorts of categorization. I will use a fake data generator called casual, which I previously used for mocks in an earlier post.

The book data will be of the following structure:

casual.define('book', () => {
    const author = casual.random_element(authors);
    const book = {
        first_name: author.first,
        last_name: author.last,
        title: casual.random_element(author.titles),
        category: casual.random_element(author.category)
    };
    return book;
});

Every time I request a casual.book I get a book with a new set of values. It's not entirely random: the generator uses some predefined data for well-known authors, and more-or-less randomly generated data for other authors. Here's a sample:
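
As an aside, here is a minimal sketch of how a sample set like the one below might be collected. It assumes the casual package, plus the authors array and the casual.define('book', ...) registration shown above; the loop itself is my own illustration, not code from the original post.

// Collect a handful of generated books into a dataset object.
// Each property access on casual.book runs the 'book' definition again.
const casual = require('casual');

const dataset = [];
for (let i = 0; i < 8; i++) {
    dataset.push(casual.book);
}

console.log({ dataset });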

{ dataset:
   [ { first_name: 'Barbara',
       last_name: 'Cartland',
       title: 'The Pirate and the Piano Teacher',
       category: 'thriller' },
     { first_name: 'Carlie',
       last_name: 'Haley',
       title: 'Digitized Global Orchestration',
       category: 'engineering' },
     { first_name: 'Arthur',
       last_name: 'Doyle',
       title: 'The Case of the Spotted Dick',
       category: 'mystery' },
     { first_name: 'Reinhold',
       last_name: 'Gutmann',
       title: 'Managed Directional Benchmark',
       category: 'management' },
     { first_name: 'Isaac',
       last_name: 'Asimov',
       title: 'Once in a Venusian Sun',
       category: 'science fiction' },
     { first_name: 'R. L.',
       last_name: 'Stein',
       title: 'Why are You Scared of Me?',
       category: 'childrens books' },
     { first_name: 'Alicia',
       last_name: 'Cruickshank',
       title: 'Balanced Local Database',
       category: 'engineering' },
     { first_name: 'Chase',
       last_name: 'Runte',
       title: 'Ergonomic Tertiary Solution',
       category: 'engineering' } ] }

If you're interested in how this data was generated, the full source code used in this post can be found via the link at the end. For a little added realism, the generated data will be thrown into a database for later retrieval. Here's the SQL query used to pull it back out:

SELECT title, category, first_name, last_name
FROM book
JOIN author ON author.id = book.author
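
The post doesn't say which database engine was used, so purely as an illustration, here is a sketch of running that query from Node with better-sqlite3 (the books.db file name and the choice of SQLite are my assumptions; any client that returns rows as plain objects works the same way):

// Assumption: the generated data was loaded into a local SQLite file.
const Database = require('better-sqlite3');
const db = new Database('books.db');

const rowset = db.prepare(`
    SELECT title, category, first_name, last_name
    FROM book
    JOIN author ON author.id = book.author
`).all();   // an array of plain { title, category, first_name, last_name } objects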

The format of those results is, for all intents and purposes, identical to the format of the dataset shown just previously, for example:

[ { title: 'Proactive Regional Forecast',
    category: 'mystery',
    first_name: 'Arthur',
    last_name: 'Doyle' },
  { title: 'More Scary Stuff',
    category: 'suspense',
    first_name: 'Steven',
    last_name: 'King' },
  { title: 'Scary Stuff',
    category: 'occult',
    first_name: 'Steven',
    last_name: 'King' },
  { title: 'Persistent Neutral Info Mediaries',
    category: 'management',
    first_name: 'Maegan',
    last_name: 'Frami' },
  { title: 'Enhanced Background Frame',
    category: 'engineering',
    first_name: 'Winifred',
    last_name: 'Turner' },
...

The main difference between the dataset and the rowset is that when populating the database from the casual-generated data, I eliminated duplicate authors (by name) and duplicate book titles (by category).
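
The population code itself isn't reproduced in this copy of the post; the following is only a sketch of the kind of de-duplication described above, using Sets keyed on author name and on category-plus-title (the insertAuthor and insertBook helpers are hypothetical placeholders for the actual database inserts):

// Skip rows that repeat an author name, or repeat a title within a category.
const seenAuthors = new Set();
const seenTitles = new Set();

for (const book of dataset) {
    const authorKey = `${book.last_name},${book.first_name}`;
    const titleKey = `${book.category}|${book.title}`;

    if (!seenAuthors.has(authorKey)) {
        seenAuthors.add(authorKey);
        // insertAuthor(book);  // hypothetical insert into the author table
    }
    if (!seenTitles.has(titleKey)) {
        seenTitles.add(titleKey);
        // insertBook(book);    // hypothetical insert into the book table
    }
}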

Converting to JSON

You might notice that the dataset results were in JSON format already. What this post aims for, though, is to build a containment hierarchy that shows the relationships between authors, books, and categories in a concise way. That's not the case with the rowset values, where the results are glorified key-value pairs, each pair being a column name and its value from a table row.

So, for example, say I want to list authors, the categories they write in, and the titles of the books they authored in those categories. Each category should be shown just once, and each book within a category should also be listed only once.

This is a pretty common type of reducing operation applied to rowset data. One way to conquer the problem is to declare a container object, then populate it by looping through the rowset and initializing nested properties as you go. A typical implementation might look like the sketch below.
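
The original hand-rolled implementation isn't included in this copy of the post; handrolled() is the function name referenced in the text, and the body below is my own minimal reconstruction of code that produces the structure shown in the results further down:

// Reduce the rowset into { "Last,First": { categories: { type: { titles: [...] } } } }
function handrolled(rowset) {
    const authors = {};

    for (const row of rowset) {
        const name = `${row.last_name},${row.first_name}`;

        // local variables keep the deep property paths short
        const author = authors[name] = authors[name] || { categories: {} };
        const category = author.categories[row.category] =
            author.categories[row.category] || { titles: [] };

        if (!category.titles.includes(row.title)) {
            category.titles.push(row.title);
        }
    }

    return authors;
}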

The handrolled() method gets a bit hairy the deeper the hierarchy goes. Local variables are used to reduce long property paths, and we have to keep the meta-structure in mind to write the proper initializations of properties in the JSON object. What could be simpler?

The results returned are:

...
        "Doyle,Arthur": {
            "categories": {
                "thriller": {
                    "titles": [
                        "The Case of the Spotted Dick",
                        "The Case of the Mashed Potato"
                    ]
                },
                "mystery": {
                    "titles": [
                        "The Case of the Spotted Dick"
                    ]
                }
            }
        },
        "Asimov,Isaac": {
            "categories": {
                "science": {
                    "titles": [
                        "Once in a Venusian Sun",
                        "Total Multi Tasking Forecast"
                    ]
                },
                "general interest": {
                    "titles": [
                        "Total Multi Tasking Forecast",
                        "Once in a Venusian Sun",
                        "Fourth Foundation"
                    ]
                }
            }
        },
        "Kilback,Bradley": {
            "categories": {
                "management": {
                    "titles": [
                        "Mandatory Solution Oriented Leverage"
                    ]
                },
                "engineering": {
                    "titles": [
                        "Multi Layered Fresh Thinking Framework",
                        "Total Scalable Neural Net",
                        "Mandatory Solution Oriented Leverage"
                    ]
                },
                "reference": {
                    "titles": [
                        "Multi Layered Fresh Thinking Framework"
                    ]
                }
            }
        },
...

Building a tree with Treeize

The npm module treeize is designed to simplify the conversion of rowsets to structured JSON data through the use of descriptive keys. Installation through npm is per usual:

npm install --save treeize

JSON Rowsets

Treeize is able to recognize recurring patterns in the rowsets. It transforms them according to how the key names are defined in the metadata passed in as the seed structure. The code is sketched below.
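
The original snippet isn't included in this copy of the post; what follows is my reconstruction of roughly what it must look like, mapping the rowset columns onto treeize's descriptive key names (the toTree wrapper is my own naming; the key names themselves are the ones discussed in the next paragraph):

const Treeize = require('treeize');

function toTree(rowset) {
    // Map each flat row onto treeize's descriptive key names:
    // plurals become collections, colons indicate nesting.
    const seed = rowset.map(row => ({
        'name': `${row.last_name}, ${row.first_name}`,
        'categories:type': row.category,
        'categories:titles:name': row.title
    }));

    const authors = new Treeize();
    authors.grow(seed);
    return authors.getData();
}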

This is about a dozen lines of code, compared to roughly double that for the hand-rolled version. Notice the key values used in the mapping operation. Treeize recognizes plurals as collections, so categories and titles will be arrays. The colons (':') in the names indicate nesting: type will be a property of an object in the array of categories, and name will be a property of all objects in titles.

The tree is built when authors.grow(seed) is called, and the results are retrieved through authors.getData(). However, it doesn't quite yield the same results as the hand-rolled method:

...,
{
    "name": "Glover, Ashley",
    "categories": [
        {
            "type": "engineering",
            "titles": [
                {
                    "name": "Intuitive Full Range Capacity"
                },
                {
                    "name": "Organic Encompassing Core"
                }
            ]
        },
        {
            "type": "reference",
            "titles": [
                {
                    "name": "Distributed Client Server Service Desk"
                },
                {
                    "name": "Organic Encompassing Core"
                }
            ]
        },
        {
            "type": "management",
            "titles": [
                {
                    "name": "Organic Encompassing Core"
                }
            ]
        }
    ]
},
...

One notable difference is that categories are no longer name-keyed objects (as before), but entries in an array, each with a type property. Titles, likewise, are not just arrays of strings, but arrays of objects with a name property holding the title. Treeize interprets categories and titles as arrays of objects, not as maps (or arrays of primitives). For most use cases this is not much of an issue. But if you need to find a category by name quickly (rather than iterate through an array of categories), you can post-process the output to arrive at the same structure as before:
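
The post doesn't include that post-processing step; here is one small sketch of it (toKeyedMaps is my own hypothetical helper), folding treeize's arrays back into name-keyed maps and plain title strings:

// Convert treeize output into { "Last, First": { categories: { type: { titles: [...] } } } }
function toKeyedMaps(treeized) {
    const result = {};

    for (const author of treeized) {
        const categories = {};
        for (const category of author.categories) {
            categories[category.type] = {
                titles: category.titles.map(title => title.name)
            };
        }
        result[author.name] = { categories };
    }

    return result;
}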

...
    "Doyle, Arthur": {
        "categories": {
            "mystery": {
                "titles": [
                    "The Case of the Spotted Dick",
                    "Pre Emptive Needs Based Approach",
                    "The Case of the Mashed Potato"
                ]
            },
            "thriller": {
                "titles": [
                    "The Case of the Mashed Potato",
                    "The Pound Puppies of the Baskervilles"
                ]
            }
        }
    },
...

Spreadsheets

Sometimes data comes from spreadsheets rather than relational databases. Treeize is adept at handling this case, too. Instead of using descriptive keys as we did with rowset data in JSON format, the same descriptive format is used as column values in a header row:

var seed = [
    ['name', 'categories:type', 'categories:titles:name'],
    ['Doyle, Arthur', 'mystery', 'The Adventure of the Gyring Gerbils'],
    ['Schuppe, Katarina', 'engineering', 'Configurable Discrete Locks'],
    ['Doyle, Arthur', 'mystery', 'Holmes Alone 2'],
    ['Asimov, Isaac', 'science fiction', 'A Crack in the Foundation']
];

// same as before...
var authors = new Treeize();
authors.grow(seed);
return authors.getData();
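
Getting from an actual spreadsheet export to that array-of-arrays form is straightforward. As a rough sketch (my own illustration, assuming a tab-separated export named books.tsv whose header row already uses treeize's descriptive column names; for true comma-separated files a real parser such as csv-parse is the safer choice, since values like 'Doyle, Arthur' contain commas):

const fs = require('fs');

// Read a tab-separated file into the header-plus-rows form treeize expects.
function tsvToSeed(path) {
    return fs.readFileSync(path, 'utf8')
        .trim()
        .split(/\r?\n/)
        .map(line => line.split('\t'));
}

// const seed = tsvToSeed('books.tsv');  // then grow it exactly as above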

There are many more options that treeize supports, and I've only shown the basics. It is a powerful tool that makes light work of transforming row-based data structures.

Complete source code for these examples accompanies the original post.

