---
tags: disinfo
---

# 零時資訊傳播資料標準 0archive Data Standard

## Language

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt).

## Glossary

* Producer
* Publication
* Author
* Dataset compiler

> All "identifier" fields in the following sections may be "localized" identifiers for the entity, issued by the publisher of the dataset. They can be anonymized through reindexing or hashing algorithms. Such a localized identifier should be prefixed with an identifier of the publisher, following an [org-id](http://org-id.guide/)-like practice.

## Classes and properties

### Producer

The Producer class should have the following properties:

* **id**: a unique identifier within the scope of this dataset for the content producer.
* **name**: the name of the company, organization, discussion board, social network profile, or other kind of information channel that produces content.
* **alternate names**: shorter names, abbreviations, or names of the producer in other languages that are in common use.
* **identifiers**: ways to identify the producer, such as its legal entity if there is one, its tracking IDs used in services such as Google Analytics, or others.
* **description**: a short description of the content producer given by the dataset compiler.
* **classification**: the classification of the content producer given by the dataset compiler.
* **url**: a URL of the website, online news outlet, discussion board, or social network profile where the content made by the producer is published.
* **languages**: languages primarily used to produce the content.
* **licenses**: licenses primarily used on the content.
* **date of first seen**: the publishing date and time of the earliest content by the producer included in the dataset.
* **date of last update**: the publishing date and time of the newest content by the producer included in the dataset.
* **followership**: number of subscribers, followers, users who "like" the channel, or other information about the followership of the producer.

#### JSON Serialization

* The name `other_names` is used instead of alternate names; its value is a list of name strings.
* The value of the `identifiers` property is a list of identifier objects. An identifier object has `scheme` and `identifier` properties. The dataset compiler is responsible for specifying the identifier objects it uses in the dataset.
* The name `first_seen_at` is used instead of date of first seen.
* The name `last_update_at` is used instead of date of last update.
* The value of the `followership` property is a followership object. The dataset compiler is responsible for specifying the format of the followership objects it uses in the dataset.

#### JSON Schema

TODO

#### References

* Ronny's newsdiff [data](https://github.com/ronnywang/newsdiff/blob/master/webdata/stdlibs/url-normalizer.js/map.csv)
* Gugod's people-in-news [news-sites.txt](https://github.com/g0v/people-in-news/blob/master/etc/news-sites.txt)
* 零時檔案局 [Sites](https://airtable.com/shrd0utGHlTWmQsYt)
* Schema.org: [Organization](https://schema.org/Organization), [Person](https://schema.org/Person), [Project](https://schema.org/Project)
* "Account information" in [Twitter datasets](https://transparency.twitter.com/en/information-operations.html)
* Popolo standard: [Organization](https://www.popoloproject.com/specs/organization.html)
* Open Contracting Data Standard [Organization](https://standard.open-contracting.org/latest/en/schema/reference/#organization)
* "[Followership and social media marketing](https://www.researchgate.net/publication/282685486_Followership_and_social_media_marketing)", "[Social Media Followership as a Predictor of News Website Traffic](https://www.tandfonline.com/doi/abs/10.1080/17512786.2019.1635040?journalCode=rjop20)"

### Publication

The Publication class should have the following properties:

* **id**: a unique identifier within the scope of this dataset for the publication.
* **version**: version of this copy of the publication.
* **identifiers**: ways to identify the publication, such as IDs used by popular content databases.
* **producer_id**: identifier of the producer that made the publication.
* **canonical_url**: a URL of the publication, normalized to be a unique identifier within the scope of this dataset.
* **title**: title of the publication.
* **text**: text of the publication.
* **author**: author of the content of the publication.
* **about**: the subject this publication is replying to, such as a tweet being replied to on Twitter.
* **language**: language used by the text of the publication.
* **license**: license of the publication.
* **date of publication**: the date and time when the content was published.
* **date of first seen**: the date and time when the publication was first observed by the dataset compiler.
* **date of last update**: the date and time when the latest update to the content was observed by the dataset compiler.
* **urls**: URLs that appear in the text of the publication, excluding the ones pointing to the publication itself.
* **hashtags**: hashtags used in the content, as in Twitter tweets and Facebook posts.
* **mentions**: mentions of other users made in the content, as in Twitter tweets and Facebook posts.
* **keywords**: keywords of the content as given by the producer in meta tags or in the publication.
* **reactions**: numbers of reader reactions, such as Facebook "like" or other emotions, Twitter "like", Medium "claps", or user views.
* **comments**: comments on the publication, such as PTT and Facebook replies.
* **internet connection**: information about the connection used to publish this content, such as the IP address on PTT articles or the geolocation of Twitter tweets.

#### JSON Serialization

* The value of the `identifiers` property is a list of identifier objects. An identifier object has `scheme` and `identifier` properties. The dataset compiler is responsible for specifying the identifier objects it uses in the dataset.
* The name `publication_text` is used instead of text.
* The name `reply_to` is used instead of about.
* The name `published_at` is used instead of date of publication.
* The name `first_seen_at` is used instead of date of first seen.
* The name `last_update_at` is used instead of date of last update.
* The value of the `reactions` property is a reactions object. The dataset compiler is responsible for specifying the format of the reactions objects it uses in the dataset.
* The value of the `comments` property is a list of comment objects.
* The name `connect_from` is used instead of internet connection.

#### JSON Schema

TODO

#### References

* Schema.org: [Article](https://schema.org/Article), [WebPage](https://schema.org/WebPage), [Message](https://schema.org/Message), [SocialMediaPosting](https://schema.org/SocialMediaPosting)
* "articles" in [cofacts opendata](https://github.com/cofacts/opendata)
* "Tweet information" in [Twitter datasets](https://transparency.twitter.com/en/information-operations.html)
* Popolo standard [Speech](https://www.popoloproject.com/specs/speech.html), [Motion](https://www.popoloproject.com/specs/motion.html)
* Gugod's people-in-news [Article.pm](https://github.com/g0v/people-in-news/blob/master/lib/Sn/Article.pm), [NewsExtractor::Article](https://github.com/perltaiwan/NewsExtractor/blob/master/lib/NewsExtractor/Article.pm)

### Comment

The Comment class should have the following properties:

* **id**: a unique identifier within the scope of all comments about the same subject.
* **author**: author of the comment.
* **text**: text of the comment.
* **date of publication**: the date and time when the comment was published.
* **about**: the subject this comment is in reply to.
* **reactions**: numbers of reader reactions, such as Facebook "like" or other emotions, upvotes, or downvotes.
* **internet connection**: information about the connection used to publish this content, such as the IP address on PTT comments.

#### JSON Serialization

* The name `comment_text` is used instead of text.
* The name `published_at` is used instead of date of publication.
* The name `reply_to` is used instead of about.
* If the subject of this comment is another comment, the value of the `reply_to` property is the `id` of that comment. If the subject of this comment is the publication it belongs to, the `reply_to` value may be omitted.
* The value of the `reactions` property is a reactions object. The dataset compiler is responsible for specifying the format of the reactions objects it uses in the dataset.
* The name `connect_from` is used instead of internet connection.

#### JSON Schema

TODO

## Specifications by 0archive

### Publication

* `version` is the UNIX timestamp of the time when the copy of the publication was archived.

## Serialization and packaging

The following serialization scheme is supported:

* JSON Lines

Datasets are packaged in the following format:

* Frictionless Data [Data Package](https://frictionlessdata.io/specs/data-package/)

## Similar projects

* [The OpenArchive Project](https://open-archive.org/)
* [Civil Archive](https://grants.g0v.tw/projects/586a7be0a327a4001ee49126) in g0v grants
* [WARC](https://en.wikipedia.org/wiki/Web_ARChive)

# Work in progress

The following are pending features of the standard that may or may not be included in the final draft.
## Classes and properties

### Media

* identifier
* canonical_url
* author
* title
* description
* location
* content_type
* content_length
* content
* content_hash
* original_filename
* first_seen_at
* last_updated_at
* tags

#### References

* [Open Archive "SAVE Space" Specification](https://github.com/OpenArchive/Save-app-android/blob/master/docs/OpenArchiveSpaceCapsuleSpec.md)
* Schema.org [MediaObject](https://schema.org/MediaObject)

### Claim

#### References

* Schema.org [Claim](https://schema.org/Claim)

### ClaimReview

* summary
* canonical_url
* item_reviewed
* review_body
* review_rating
* references

#### References

* Schema.org [ClaimReview](https://schema.org/ClaimReview)
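
# Example: JSON Lines serialization

The JSON Serialization rules for the Publication and Comment classes can be sketched in Python as follows. This is a minimal illustration, not part of the standard: the helper names `to_json_names` and `to_json_lines`, the `example-org:` identifier prefix, and every sample value are hypothetical. The standard does not fix a format for date values other than `version`, so the use of UNIX timestamps for the other date fields is an assumption, as is the shape of the `reactions` object, which is left to the dataset compiler.

```python
import json

# Abstract Publication property names mapped to the JSON names listed under
# "JSON Serialization"; properties not listed here keep their names.
PUBLICATION_JSON_NAMES = {
    "text": "publication_text",
    "about": "reply_to",
    "date of publication": "published_at",
    "date of first seen": "first_seen_at",
    "date of last update": "last_update_at",
    "internet connection": "connect_from",
}


def to_json_names(properties, mapping):
    """Rename abstract property names to their JSON serialization names."""
    return {mapping.get(name, name): value for name, value in properties.items()}


def to_json_lines(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


# Hypothetical publication; every value below is invented for illustration.
publication = to_json_names(
    {
        "id": "example-org:publication-1",
        "version": 1580515200,  # UNIX timestamp of archival (0archive rule)
        "producer_id": "example-org:producer-1",
        "canonical_url": "https://news.example.org/articles/1",
        "title": "Example title",
        "text": "Example body text.",
        "date of publication": 1580511600,  # timestamp format is an assumption
        "date of first seen": 1580515200,
        "reactions": {"like": 3},  # format is defined by the dataset compiler
        "comments": [
            # c1 replies to the publication itself, so reply_to is omitted.
            {"id": "c1", "author": "reader-a",
             "comment_text": "First comment.", "published_at": 1580512000},
            # c2 replies to comment c1, so reply_to holds that comment's id.
            {"id": "c2", "author": "reader-b",
             "comment_text": "A reply.", "published_at": 1580513000,
             "reply_to": "c1"},
        ],
    },
    PUBLICATION_JSON_NAMES,
)

print(to_json_lines([publication]), end="")
```

Each call to `to_json_lines` yields text that can be written to a `.jsonl` file and then referenced as a resource from a Data Package descriptor; how records are split across files is left open here.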
