In this demo, we're going to take a look at how you can query Google Sheets data in Google Drive directly with BigQuery. All the demos, again, are inside of our public course repository under data engineering and demos. We're going to be adding more and will probably change some of these names, so just refer to that page and you'll see all the ones we have, including where to find this simple external data query demo. Here's a little bit of background on where BigQuery can actually query data: there are two types of storage. There's native storage, where the data lives inside BigQuery, and then there are federated queries, where BigQuery can query data that doesn't live inside of it from a variety of different sources. This demo is going to focus on Google Drive. So the first thing we need to do is create our data source. We're going to navigate to sheets.new, and boom, we've got a new spreadsheet. Next, we're just going to copy in some data, anything you want, as long as it starts in cell A1. One of the really cool things about Google Sheets is that you can actually invoke a machine learning API by putting =GOOGLETRANSLATE in a cell. You give it the content you want to translate, the source language, which is English, and the target language, which is Spanish, each as a two-letter code, and boom, you get a translation. So we can say something like "BigQuery is awesome" and "the weather is terrible", whatever you want. If you wanted to look at some of the other language codes, let's go ahead and pull up the list of two-letter codes and see what we've got. Let's get some special characters in there by translating to German. Hmm, that one says it's not supported; I probably mistyped the code. What about Chinese? Let me look that code up. There's the one I want, and boom, all right. Now let's go ahead and see how we get this data live into BigQuery, because when we make changes here, we're going to see them live inside of BigQuery as well.
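As a sketch, the formulas we're typing into the sheet look like this. GOOGLETRANSLATE takes the text, a source language code, and a target language code; the cell reference A2 and the phrases are just example placeholders for whatever data you pasted in:

```
=GOOGLETRANSLATE("BigQuery is awesome", "en", "es")
=GOOGLETRANSLATE(A2, "en", "de")
=DETECTLANGUAGE(A2)
```

DETECTLANGUAGE is a handy companion function if you're not sure what the source language code should be.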
So what you do is copy the URL of the sheet. I've got that on my clipboard, and I'm going to navigate back to Google Cloud Platform, inside of BigQuery. We're going to create a new dataset and call it ecommerce, since that dataset doesn't already exist. We'll give it a description like "product translations" or something like that; the default settings are fine. Now we've got a new dataset. A dataset is just a container that holds tables, views, and machine learning models. It's empty right now, which is why it doesn't expand into anything. We're going to create a brand-new table, and instead of it being an empty table, the source is going to be Drive, which is where Google Sheets lives. I'll paste in my Drive URL and specify that it's a Google Sheet. I don't need to worry about the range, but if you wanted to pull in just a subset of the data, that's fine too. It's an external table, and that's exactly what we're demoing. We'll name the table something like products_for_translation; say you've got product titles and descriptions you just want to get into different languages really quickly. Next, you need to provide a schema; you don't want BigQuery to auto-detect it. We provide the schema for you in the demo, so scroll down, and boom, you can literally paste it in there or edit it with the text UI. Scrolling down to Advanced Options, we do have a header row, so we're going to enter 1 for header rows to skip, and then create the table. Now that we have the table, clicking on it gives you a couple of free options, like Query Table, which gives you your fully qualified table name; it's mainly that you get the syntax for free. BigQuery will charge you for the bytes of data that you process. Of course, there is a free tier, up to a certain number of terabytes of data processed per month, and this demo won't come anywhere near that limit. Then you select which columns you want; selecting everything with SELECT * is a bad practice.
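If you'd rather script this step instead of clicking through the UI, the same external table can be created with BigQuery DDL. This is just a sketch: the sheet URL, table name, and column names below are placeholders for whatever you used in your own setup:

```sql
-- Create an external table backed by a Google Sheet.
-- Replace the URI with your own sheet's URL and adjust the schema to match your data.
CREATE EXTERNAL TABLE ecommerce.products_for_translation (
  title STRING,
  description STRING,
  translated_description STRING
)
OPTIONS (
  format = 'GOOGLE_SHEETS',
  uris = ['https://docs.google.com/spreadsheets/d/YOUR_SHEET_ID'],
  skip_leading_rows = 1  -- skip the header row, same as the UI option
);
```

This does the same thing as the console flow: no data is copied, and BigQuery reads the sheet live at query time.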
You can actually go into the schema and click the columns, which is kind of cool; it adds the column names into the query for me. Since it's an external table, BigQuery doesn't actually know how much data it's going to process ahead of time, because the data isn't stored natively. Regardless, when we run it, federated queries take a little bit longer because BigQuery has to authenticate that connection out to Drive, but boom, there is our data, directly from Drive. Again, it's a very small amount of data, 477 bytes. So say we wanted to just add in a bunch of other languages. Let's see: Korean, French. This is about as far as my language knowledge goes, but let's see what we get; you're welcome to experiment with this. There we go. Russian? That's "ru". Try a couple of different ones, and of course you can see that BigQuery supports those special characters as well. Now, you already have the connection set up, so if you rerun the same query, do you think it will show the old data or the new data? If you said the new data, you're absolutely correct. Boom, there you go. So that's the advantage if you have somebody maintaining stuff in Sheets. Granted, you could have issues maintaining your data in a spreadsheet, because other people with edit access can change it unless you carefully permission that access, but maybe that's part of your ETL process: someone's maintaining parameters in a sheet and you need to bring them in. You can do that live, and then run CREATE OR REPLACE TABLE or something like that from here. The negative side of external data connections is that, since the data isn't natively stored in BigQuery, you don't get the benefits of BigQuery native storage. In particular, every time you run this query, you don't get the advantage of the cache; even if it's the same data and the same query, it'll never be cached, because caching is a metadata feature of native BigQuery storage. And again, your query can be a little bit slower because it has to reach out and authenticate that connection to Drive. All right, that's it.
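That ETL step might look something like the sketch below, assuming the placeholder dataset and table names from earlier; the snapshot table name is hypothetical:

```sql
-- Query the live external table; BigQuery re-reads the sheet on every run,
-- so edits in Google Sheets show up immediately (but results are never cached).
SELECT title, description, translated_description
FROM ecommerce.products_for_translation;

-- Materialize a point-in-time copy into native BigQuery storage,
-- which does get caching and the other native-storage benefits.
CREATE OR REPLACE TABLE ecommerce.products_snapshot AS
SELECT title, description, translated_description
FROM ecommerce.products_for_translation;
```

The trade-off is the one described above: the external table is always live but slower and uncached, while the snapshot is fast and cacheable but frozen at the moment you created it.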
Go ahead and experiment with different messages. Again, you can invoke the Google Translate API directly in the sheet as well. Good luck, and have fun.