I have a few questions about data science at social media companies (or any other company).
Structured data usually comes in the form of CRM reports and the like. How does a data scientist tap into unstructured user or location data? Would they create an in-house API that gives them access to any data at any time, for example server timestamps of chat sessions? Does the data scientist usually build such an API, and if so, how would it be built for a social app? A data scientist usually isn't fluent in backend languages, right? So would it be the responsibility of other engineers to create the API so the data scientist can access server data? What languages would APIs that access server data for social apps usually be written in? Node and Express? I know many data scientists are fluent in JavaScript in addition to Java, Python, R, etc.
Example case: Instagram data scientists would always have access to ANY server data so that they can do their job of describing and predicting. Right or wrong? Would Instagram server data be sent to a data warehouse using Hadoop or something similar? How would the data scientist or data engineer get access to all the server/user/app data above and beyond the structured data they receive from in-house departments?