jerryking : data_collection   6

Data Challenges Are Halting AI Projects, IBM Executive Says
May 28, 2019 | WSJ | By Jared Council.

About 80% of the work with an AI project is collecting and preparing data. Some companies aren’t prepared for the cost and work associated with that going in,......“And so you run out of patience along the way, because you spend your first year just collecting and cleansing the data,”.....“And you say: ‘Hey, wait a moment, where’s the AI? I’m not getting the benefit.’ And you kind of bail on it.”....A report this month by Forrester Research Inc. found that data quality is among the biggest AI project challenges. Forrester analyst Michele Goetz said companies pursuing such projects generally lack an expert understanding of what data is needed for machine-learning models and struggle with preparing data in a way that’s beneficial to those systems.

She said producing high-quality data involves more than just reformatting or correcting errors: Data needs to be labeled to be able to provide an explanation when questions are raised about the decisions machines make.

While AI failures aren’t much talked about, Ms. Goetz said companies should be prepared for them and use them as teachable moments. “Rather than looking at it as a failure, be mindful about, ‘What did you learn from this?’”
artificial_intelligence  data_collection  data_quality  data_wrangling  IBM  IBM_Watson  teachable_moments 
may 2019 by jerryking
‘You’re Stupid If You Don’t Get Scared’: When Amazon Goes From Partner to Rival - WSJ
By Jay Greene and Laura Stevens
June 1, 2018

The data weapon
One Amazon weapon is data. In retail, Amazon gathered consumer data to learn what sold well, which helped it create its own branded goods while making tailored sales pitches with its familiar “you may also like” offer. Data helped Amazon know where to start its own delivery services to cut costs, an alternative to using United Parcel Service Inc. and FedEx Corp.

“In many ways, Amazon is nothing except a data company,” said James Thomson, a former Amazon manager who advises brands that work with the company. “And they use that data to inform all the decisions they make.”

In web services, data across the broader platform, along with customer requests, inform the company’s decisions to move into new businesses, said former Amazon executives.

That gives Amazon a valuable window into changes in how corporations in the 21st century are using cloud computing to replace their own data centers. Today’s corporations frequently want a one-stop shop for services rather than trying to stitch them together. A food-services firm, say, might want to better track data it collects from its restaurants, so it would rent computing space from Amazon and use a data service offered by a software company on Amazon’s platform to better analyze what customers order. A small business might use an Amazon partner’s online services for password and sign-on functions, along with other business-management programs.
21st._century  Amazon  AWS  brands  cloud_computing  contra-Amazon  coopetition  data  data_centers  data_collection  data_driven  delivery_services  fear  new_businesses  one-stop_shop  partnerships  platforms  private_labels  rivalries  small_business  strengths  tools  unfair_advantages 
june 2018 by jerryking
Water Data Deluge: Addressing the California Drought Requires Access to Accurate Data - The CIO Report - WSJ
April 22, 2015 | WSJ | By KIM S. NASH.

California, now in its fourth year of drought, is collecting more data than ever from utilities, municipalities and other water providers about just how much water flows through their pipes....The data-collection process, built on monthly self-reporting and spreadsheets, is critical to informing such policy decisions, which affect California’s businesses and 38.8 million residents. Some say the process, with a built-in lag time of two weeks between data collection and actionable reports, could be better, allowing for more effective, fine-tuned management of water.

“More data and better data will allow for more nuanced approaches and potentially allow the water system to function more efficiently,”...“Right now, there are inefficiencies in the system and they don’t know exactly where, so they have to resort to blanket policy responses.”...the State Water Resources Control Board imports the data into a spreadsheet to tabulate and compare with prior months. Researchers then cleanse the data, find and resolve anomalies and create graphics to show what’s happened with water in the last month. The process takes about 2 weeks....accuracy is an issue in any self-reporting scenario...while data management could be improved by installing smart meters to feed information directly to the Control Board automatically... there are drawbacks to any technology. Smart meters can fail, for example. “The nice thing about spreadsheets is anyone can open it up and immediately see everything there,”
lag_time  water  California  data  spreadsheets  inefficiencies  municipalities  utilities  bureaucracies  droughts  vulnerabilities  self-reporting  decision_making  Industrial_Internet  SPOF  bottlenecks  data_management  data_quality  data_capture  data_collection 
april 2015 by jerryking
It’s a Whole New Data Game for Business - WSJ
Feb. 9, 2015 | WSJ |

opportunistic data collection is leading to entirely new kinds of data that aren’t well suited to the existing statistical and data-mining methodologies. So point number one is that you need to think hard about the questions that you have and about the way that the data were collected and build new statistical tools to answer those questions. Don’t expect the existing software package that you have is going to give you the tools you need....Point number two is having to deal with distributed data....What do you do when the data that you want to analyze are actually in different places?

There’s lots of clever solutions for doing that. But at some point, the volume of data’s going to outstrip the ability to do that. You’re forced to think about how you might, for example, reduce those data sets, so that they’re easier to move.
data  data_collection  datasets  data_mining  massive_data_sets  distributed_data  haystacks  questions  tools  unstructured_data 
february 2015 by jerryking
Value of big data depends on context
According to Hayek, it is not only localised and dispersed knowledge, but also tacit knowledge that is crucial for the functioning of the market system. Often, useful localised knowledge is tacit. By definition, tacit knowledge cannot be articulated and mechanically transferred to other individuals. [See Paul Graham on doing things that don't scale] Companies and governments have become more successful in collecting large volumes of data but it is nearly impossible to capture useful tacit knowledge by these data collection methods.

Furthermore, the value of big data is not about the volume and the amount of collected data but it depends on our ability to understand and interpret the data. As human faculties of interpretation and understanding differ greatly, the value of big data is subjective and dependent on particular context. Ironically, the skillful use of big data may require tacit knowledge.
data_collection  letters_to_the_editor  massive_data_sets  Friedrich_Hayek  tacit_data  contextual  sense-making  interpretation  tacit_knowledge  valuations  Paul_Graham  unscalability 
february 2013 by jerryking
The power of managing complexity
May 11, 2009 | Globe & Mail | by PIERRE M. LAVALLÉE.

Reducing process complexity should be a company's last step. It involves looking for the process improvements that add the most value and eliminating unnecessary data collection. One of the world's largest natural resources companies found that it had no fewer than 483 process improvement projects in the works – and that only 25 would deliver a significant impact. In combination with product and organizational simplifications, the company boosted operating income by more than 20 per cent. Meantime, the same company found it could reduce its volume of reports by 40 per cent in one major business unit.
complexity  economic_downturn  Bain  data  simplicity  information_overload  process_improvements  data_collection 
may 2009 by jerryking
