Replies
  • 1 # 食用技巧

    What is big data?

    Let's first hear what the experts say:

    Big data just means more, and then more again. The old equipment cannot store it or compute on it.

    -- 啪菠蘿·畢加索 (a joking play on "Pablo Picasso")

    Big data is not a random sample but all the data; not exactness but messiness; not causation but correlation.

    -- Schönberger

    Over to TED: Kenneth Cukier, "Big data is better data"

    America's favorite pie is?

    Audience: Apple.

    Kenneth Cukier: Apple. Of course it is. How do we know it? Because of data. You look at supermarket sales. You look at supermarket sales of 30-centimeter pies that are frozen, and apple wins, no contest. The majority of the sales are apple. But then supermarkets started selling smaller, 11-centimeter pies, and suddenly, apple fell to fourth or fifth place. Why? What happened? Okay, think about it. When you buy a 30-centimeter pie, the whole family has to agree, and apple is everyone's second favorite. (Laughter) But when you buy an individual 11-centimeter pie, you can buy the one that you want. You can get your first choice. You have more data. You can see something that you couldn't see when you only had smaller amounts of it.

    Once you have the sales data for the small pies, you find that apple pie actually ranks only around fourth or fifth.

    With more data, you can see things you could not see before.
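
    The pie effect can be reproduced with a toy model. The sketch below is only an illustration: the families, their preferences, and the scoring rule (2 points for a first choice, 1 for a second) are all hypothetical assumptions, not supermarket data or anything taken from the talk. It shows how a flavor that is nearly everyone's second choice can win every shared purchase and still rank low once people buy individually.

```python
from collections import Counter

# Hypothetical (first choice, second choice) pairs for the members of three
# made-up families. These values are invented for illustration only.
families = [
    [("cherry", "apple"), ("pecan", "apple"), ("pumpkin", "apple")],
    [("cherry", "apple"), ("blueberry", "apple"), ("pecan", "apple")],
    [("apple", "cherry"), ("pumpkin", "apple"), ("pecan", "apple")],
]


def family_pick(members):
    """Choose one shared pie: 2 points per first choice, 1 per second choice."""
    scores = Counter()
    for first, second in members:
        scores[first] += 2
        scores[second] += 1
    # Iterate in alphabetical order so that ties are broken deterministically.
    return max(sorted(scores), key=lambda pie: scores[pie])


def ranking(sales):
    """Return flavors ordered by sales (descending), then by name."""
    return sorted(sales, key=lambda pie: (-sales[pie], pie))


# One large pie per family: the compromise flavor is bought.
large_sales = Counter(family_pick(members) for members in families)

# One small pie per person: everyone buys their own first choice.
small_sales = Counter(first for members in families for first, _ in members)

print("large pies:", ranking(large_sales))
print("small pies:", ranking(small_sales))
```

    With these assumed preferences, apple is the only flavor sold as a large pie, yet it drops to fourth place among the small pies. The finer-grained data exposes the first choices that the shared purchases were hiding.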

    What is the core value of big data?

    The value of big data lies in collecting the data.

    The big data industry is itself a service industry that depends on its data sources.

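    What is most fundamental about big data is the major change in how information is collected. The rise of big data is closely tied to the large volume of information that now appears directly online.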

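    Weibo, Tmall, Taobao, WeChat, and similar services directly generate enormous volumes of information, including location data, message logs, purchase records, reviews, and reading history. In that sense, Internet companies naturally carry the label of data companies. Looking more closely at where data originates, however, we find that a great deal of it still needs substantial collection and classification work.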

    Joel Selanikio: Transcript of "The big-data revolution in healthcare"

    There's a concept that people talk about nowadays called "big data." And what they're talking about is all of the information that we're generating through our interaction with and over the Internet, everything from Facebook and Twitter to music downloads, movies, streaming, all this kind of stuff, the live streaming of TED. And the folks who work with big data, for them, they talk about that their biggest problem is we have so much information. The biggest problem is: how do we organize all that information?

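    Everyone talks about big data these days, but what they mean is the information generated every day on sites such as Facebook, Twitter, and streaming services. People who work with big data feel that the volume of data they have is simply too large.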

    (For them, organizing that information remains the hardest problem.)

    I can tell you that, working in global health, that is not our biggest problem. Because for us, even though the light is better on the Internet, the data that would help us solve the problems we're trying to solve is not actually present on the Internet. So we don't know, for example, how many people right now are being affected by disasters or by conflict situations. We don't know for, really, basically, any of the clinics in the developing world, which ones have medicines and which ones don't. We have no idea of what the supply chain is for those clinics. We don't know -- and this is really amazing to me -- we don't know how many children were born -- or how many children there are -- in Bolivia or Botswana or Bhutan. We don't know how many kids died last week in any of those countries. We don't know the needs of the elderly, the mentally ill. For all of these different critically important problems or critically important areas that we want to solve problems in, we basically know nothing at all.

    Much useful data is still not online at all and has to be collected by traditional means. In many fields, basic problems with data remain very evident.
