Hello again, and welcome back. In this lesson, I'm going to continue where I left off in the last lesson, and we're going to work on processing our data and getting it loaded for our analysis. So, I suspect that we probably have a few duplicate layers here, and what I'm going to do is arrange them by their letter code. I'll ignore the N and the E for now; those look like prefixes that I'll still need to go look up at some point, but it looks like the county code is this two-letter thing. So I'm going to sort by them, and that way, I can see which ones I have duplicates of. Okay, and it looks like I do, in fact, have a few duplicates. I have a duplicate with AP, a duplicate with MN, and a duplicate with MA, as well as potentially a duplicate with FR, between the EFR and the FR. So, what I'm going to do on the ones that don't have those prefixes right now is just remove the one with the older year. Since that one's from 2001 and that one's from 2013, let's get rid of the older data. And I can do the same thing for MA because the other one's newer, 2011 MA versus 2001 MA. And then for MN and FR, I need to figure out what those prefixes are before I decide to toss one or the other out, even though it's very clear that one is newer than the other, 2009 versus 2000, and 2010 versus 2001. The rest of these look like they are the only ones we have though, so we'll keep those. I could probably also go to the metadata, but going back to the page I downloaded them from, it looks like NMN means Northern Mono County and EFR means Eastern Fresno County. So, I might end up keeping the older data just for more complete coverage. These look like maybe they're partial updates, but I should check how they overlay each other. So I'll turn on EFR and FR, and we'll see how they work together here. Yes, so the EFR one has all this, but it doesn't have this updated.
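The duplicate check described above can also be sketched in a few lines of Python. This is only an illustration: the layer names are hypothetical and assume a pattern of two-digit survey year, optional N or E prefix, then a two-letter county code, with every year falling in the 2000s. The names in your own map may differ.

```python
import re
from collections import defaultdict

# Hypothetical layer names following the assumed pattern
# <2-digit year><optional N/E prefix><2-letter county code>.
LAYER_NAMES = ["01AP", "13AP", "01MA", "11MA", "00MN", "09NMN",
               "01FR", "10EFR", "06AL"]

NAME_PATTERN = re.compile(r"^(?P<year>\d{2})(?P<prefix>[NE]?)(?P<county>[A-Z]{2})$")


def group_duplicates(names):
    """Return {county: [(year, prefix, name), ...]} for counties with 2+ layers."""
    groups = defaultdict(list)
    for name in names:
        match = NAME_PATTERN.match(name.upper())
        if match is None:
            continue  # does not fit the assumed pattern; review it by hand
        year = 2000 + int(match["year"])  # assumption: all surveys are 2000 or later
        groups[match["county"]].append((year, match["prefix"], name))
    return {county: sorted(items) for county, items in groups.items() if len(items) > 1}


def older_unprefixed_layers(duplicates):
    """List older layers that are safe to drop: groups with no N/E prefix at all."""
    to_drop = []
    for items in duplicates.values():
        if all(prefix == "" for _, prefix, _ in items):
            to_drop.extend(name for _, _, name in items[:-1])  # keep the newest
    return sorted(to_drop)


duplicates = group_duplicates(LAYER_NAMES)
print(sorted(duplicates))                   # counties with more than one layer
print(older_unprefixed_layers(duplicates))  # older layers with no prefix to investigate
```

Groups that include a prefixed layer, like MN and FR here, are deliberately left out of the drop list, since those need a look at the coverage first.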
So I think I want to get the whole study area, even if my data is a little out of date because of it. But your trade-offs may vary. You may decide that you'd rather have the more accurate, more modern 2009 data versus the older 2000 data, which is really aging at this point. So, I'm going to remove EFR and NMN here. And then what's next? Well, I think what I want to do now is bring all of these into a single data set. Now, I need to figure out my workflow: whether I'm going to remove the stuff I don't care about now and then merge the data sets, or merge them and then remove the overlapping polygons afterward. I think the first thing I want to try is running a merge, even with the overlapping polygons, and then running my query and exporting a new data set. Because then I only have to run the query once, instead of running it on all of these layers and then merging the results. So let's give it a try, and we'll see how it goes. So I'll go to ArcToolbox. Let's pin this out here, and Merge is in Data Management Tools. And I think it's in General and then Merge. And I'm going to select all of these. Oops, I dropped a layer group, and it didn't like that, so I'm going to select all of them and bring them in here. And ArcGIS figures out that the fields all match each other here. So what's nice is that it's going to bring them all together as long as they have the same name. Now, I'm not sure yet whether or not I care about the rest of this data, so I'm not going to remove it just yet. But my guess is that the only fields we really need for our analysis right now are CLASS1 and maybe the subclass, which basically tell us what types of agriculture are grown, and then we'll figure out what's growing on the floodplain soon. Okay, so notice that it put the output data set in my normal default geodatabase.
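As a rough picture of what "bringing fields together as long as they have the same name" means, here is a small plain-Python sketch. It is not the ArcGIS Merge tool, just the same idea applied to hypothetical attribute rows; the field names SUBCLASS1 and ACRES are made up for the example.

```python
def merge_by_field_name(tables):
    """Stack rows from several tables, lining columns up by field name.

    Fields missing from a given input are filled with None in the output.
    """
    field_names = []
    for rows in tables:
        for row in rows:
            for name in row:
                if name not in field_names:
                    field_names.append(name)
    merged = [
        {name: row.get(name) for name in field_names}
        for rows in tables
        for row in rows
    ]
    return field_names, merged


# Hypothetical attribute rows from two county layers.
county_a = [{"CLASS1": "NR", "SUBCLASS1": "1"}]
county_b = [{"CLASS1": "U", "ACRES": 12.5}]

fields, rows = merge_by_field_name([county_a, county_b])
print(fields)
for row in rows:
    print(row)
```

The shared CLASS1 column ends up as a single field in the output, which is why one query can later run over the whole merged data set.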
Here's where, usually for a project, I like having a specific default geodatabase. That way, all the intermediate products are with that project, stored with relative paths, and if I move this data around, my working map document doesn't break. So let's make that now, and then, in a moment, I'll set it as my default geodatabase on this project. I usually make a new folder called scratch for stuff like this. And that way, any working things like TINs and geodatabases go in here. And then I'll call this scratch.gdb. So I'll go into scratch, and then I'll call the output land_use_merged_all, and click OK. Now, while this is merging, I'll note that while we don't have to add metadata to all of these individual pieces, it can be a good thing to do because it'll help us keep track of what we did throughout. I usually add my metadata after I come up with a more final data set. So after I merge this and then after I subset it, I would add the metadata saying, okay, it was derived from all this data that came from this source, and then I did this operation on it. Looks like it had trouble. It said it cannot open Land Use Shapefiles/06al, okay. Well, let's try it without that particular one, and see what happens if we run it now. And it's still hunting here. And now it can't open 13AP, so I suspect that it's having a problem with all of these. I'm going to remove all of these here. And let's try ungrouping this, and then bringing them in here now, and see what happens. So I'm thinking it's working better now. These are the types of things you have to debug with ArcGIS. Use your intuition, and think of things you're doing that aren't typical for your situation. Like the fact that I dragged the layers in from the table of contents and that they happened to be grouped might have been the problem with the Merge tool. Or maybe sometimes tools have problems with where you're storing particular things.
So if you had it as a shapefile, try storing it in a file geodatabase instead, or vice versa. Okay, and that succeeded, and we can now look at our entire data set. So Zoom To Layer, and I'm going to turn off that one. And that's quite a lot here. Let's zoom into part of it. And just to show you what I meant by overlapping polygons earlier, I'm assuming this extends pretty far. And if I select there, I get both, and you can see the boundaries hidden behind the other one. This polygon here is drawing on top of this polygon here, even though they're in the same data set. So overlapping polygons can be a little confusing, but let's see if we can remove them now. So I'll go to Select By Attributes and make sure to select land_use_merged_all. And then I'm going to bring up our query from last time: go back to agriculture project data, then queries, and we'll copy that query. And I suspect I might have a problem with this. Think for a second about what's going on here. We'll see what happens when I run this query. It might have worked, but let's take a look at the attribute table. So in fact, my selection actually looks pretty good. It's still not taking care of my overlapping polygon problem, so I might have to modify my query. The problem I thought I was going to have is that the field delimiters are different here. So for CLASS1, now that I'm storing it in a file geodatabase instead of a shapefile, sometimes you need different field delimiters. The field name often doesn't need the same quotes around it in a file geodatabase as it does in a shapefile. But the values themselves can use the single quotes. So, I'm just going to run it again now that I've corrected it, even though it seemed like it was fine before. We'll click OK. And I still have these out here. But they're not in my selection. I kept tripping myself up with my logic, thinking that I wasn't selecting the right ones because I was trying to get rid of those, but I had a NOT IN query.
So, I'm selecting all the ones that aren't those, and once I take my selection and make it a new layer, I should have only the ones I'm interested in. So, let's zoom in down here again to that area that we're a little more familiar with. And theoretically, we're only getting the agricultural use here. So let's make this a little simpler to view. And down here in the valley, we're getting a ton of agriculture, so that's great. Now, just for our viewing before we finalize this, I'm going to create a layer from the selected features, turn off this layer here, and then zoom out to this layer and check it out. And it's still a lot, which is good because it probably means it's selecting the ones we're interested in. But it looks like we've got some oddities over here, where we're getting things around the edges. So let's take a look at what those are. And that's class Z, so I'm not sure why that got selected in this data set. Let's select that in our attribute table and see what's going on. And I think it has a space in front of it. So, that's our issue right now. We might need to modify our query because of that. So, let's take a look here. You're going to have all kinds of data quality issues like this. So just to make this easier, I'm going to go to Editor, Start Editing, and edit that one, just so I can edit the attribute table here. It's going to give me a warning. I'm not editing a feature, so I'm not concerned about spatial references, and yeah, in fact, there's a space in front of it. Okay, so we need to modify our query now, due to just odd little data quality issues. So Stop Editing, and close this out. And it's nice, I mean, we should have done data checking anyway. But it's nice that we had something so glaringly wrong that helped us find that issue. Because we probably have that issue with other values too, so we're going to go back to Select By Attributes on land_use_merged_all. And we'll switch that layer back on.
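One way to check how widespread the stray-space problem is would be to scan the CLASS1 values for leading or trailing whitespace. The sketch below is plain Python over a hypothetical list of values (in practice they would be read from the attribute table); "AG" is a placeholder for an agricultural class, not a real code from this data set.

```python
from collections import Counter


def whitespace_variants(values):
    """Map each trimmed code to a count of the padded spellings found."""
    variants = {}
    for value in values:
        trimmed = value.strip()
        if value != trimmed:
            variants.setdefault(trimmed, Counter())[value] += 1
    return variants


# Hypothetical CLASS1 values; "AG" is a stand-in, not a real code.
sample_class1 = ["Z", " Z", " Z", "NR", "NR ", "U", "AG"]

for code, spellings in whitespace_variants(sample_class1).items():
    # repr() makes the otherwise invisible spaces show up in the output
    print(code, {repr(s): n for s, n in spellings.items()})
```

Any code that shows up in the result has at least one padded spelling that an exact-match query would miss.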
And there might be a way to combine these into a LIKE query, so that we could put a percent sign in front of all of these to account for any spaces before or after. So let's see what happens if we do that. We'll do percent and then NOT LIKE, but I don't know if that works with the IN queries either. So yeah, it didn't really like that. There are a few ways you could approach this. We could do it in batches, with a LIKE query for each of these. Or I could just list each of the variants, so space NR, NR space, and space NR space, and hope that that would get them. But hm, now I'm going to change my mind. I'm going to do this in batches with each of these values, because then I can do a LIKE query and just pull out the individual ones and stack them on top of each other. So I'd select all, and then I'd do a Remove From Current Selection for each of these. And actually, changing my mind yet again, I can't select all. So I'll create a new selection, and I'll add to the current selection. And then when I'm done, I'll switch it. So we're going to Add to Current Selection, and then CLASS1 LIKE '%NR%', and let's see what we get. Apply. Great, so we got some. And now, let's look at our queries to see which values we were looking at. NV and NW are next, so I'll do NV. And it's still Add to Current Selection, so we're expanding that selection. And those got picked up at least, so that's good. Let's see what happens if I do that Z, just to make sure that we're getting what we want. Great, all of those bad ones in here that had the space in front got picked up that time. So I'll go back and do NW. While it's running, let's check our queries list for others. So I could probably even do it without the Ws and everything, because if it starts with an N, it's natural. So let's just try our %U%, and that should pick up all of those. Save us a little time.
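To make the wildcard behavior concrete, here is a small sketch that mimics SQL LIKE matching in plain Python, handling only the % and _ wildcards and matching case-sensitively (both are simplifying assumptions). It shows why '%NR%' catches the padded values that an exact comparison misses, and also that '%U%' matches any value containing a U. "XU" is a made-up code used only to show that broad match, so it can be worth scanning what a short pattern actually selected.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern (% and _ wildcards only) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # % matches any run of characters, including none
        elif ch == "_":
            parts.append(".")    # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def like(value, pattern):
    """Return True if the whole value matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None


print(like(" NR", "NR"))      # exact pattern misses the padded value
print(like(" NR", "%NR%"))    # wildcards on both sides pick it up
print(like("NR ", "%NR%"))
print(like("XU", "%U%"))      # hypothetical code: contains U, so it matches too
```

An IN list compares whole values, which is why the wildcard had no effect there and the work moved to one LIKE query per code.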
And while this is happening, I want to point out that some of you might be thinking of better ways to do this, and that's great. I'm not claiming that this is the best way to do this, just that all these lectures are about going through the geoprocessing workflow and the analysis workflow. And if you can come up with ways that work better for you or tools that speak to you, do it. And I encourage you to comment on the questions for this project with other ways that you can think of to handle this, ways that you think are either just an alternative, or ways that might be better. Or ways that might even be worse, but are an alternative way to implement it in the event that, for whatever reason, the way we're working doesn't work out, because there's a bug or you run into a problem that you can't make it through. Okay, so we got all the urban, and now let's do the E zone. And that should be it, because I already did the Z. Okay, and now that it's done, I'm going to go to Selection, and Switch Selection, and it's warning me because I have so many records and I'm doing a Switch Selection, and I'm going to say yes. And this basically inverts the query for me, because I had some selected, and that Switch Selection command says, okay, take everything that was selected and make it unselected, and take everything that wasn't selected and make it selected. So I basically just negated my query: I selected all the things I didn't want, and then I switched my selection. Okay, now let's do again what we did before. We're going to go to that selection and create a layer from the selected features. And I'm going to remove my old one, so I don't get confused. Turn that off. And now, I have something that looks a little more like what I'm looking for, I think. Okay, so it's done drawing. I still have some gaps in the Central Valley here, but for now, we're going to ignore those.
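The select-what-you-don't-want-then-switch approach is really just set arithmetic, and a short sketch may help make that clear. This is plain Python over a hypothetical table of object IDs and CLASS1 values, not ArcGIS itself; "AG" again stands in for an agricultural class, and the substring test plays the role of LIKE '%...%'.

```python
def add_to_selection(selection, records, fragment):
    """Mimic Add to Current Selection with CLASS1 LIKE '%fragment%'."""
    matches = {oid for oid, class1 in records.items() if fragment in class1}
    return selection | matches  # union: the selection only ever grows


def switch_selection(selection, records):
    """Mimic Switch Selection: everything that was not selected becomes selected."""
    return set(records) - selection


# Hypothetical records: object ID -> CLASS1 value (some with stray spaces).
records = {1: "NR", 2: " Z", 3: "AG", 4: "U", 5: "NV ", 6: "AG", 7: "E", 8: "NW"}

unwanted = set()
for fragment in ("NR", "NV", "NW", "U", "E", "Z"):
    unwanted = add_to_selection(unwanted, records, fragment)

wanted = switch_selection(unwanted, records)
print(sorted(unwanted))
print(sorted(wanted))
```

Because each step is a union, the order of the individual queries does not matter; only the final switch turns the unwanted set into the one to export.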
We'll come back and clip it to our study area soon, but for now, we have more or less what we want here. So let's take this layer here and export it as our working copy. And I forgot to set my default geodatabase again, but that's okay. So, I'm moving to that geodatabase now, and actually, I'm not going to put this in scratch. I'm going to put this in a new geodatabase, where I keep processed data. And you can call it whatever you want, but I'll call it just data. So I have my raw folder, which is all the unprocessed stuff, and then data is the stuff that I want to throw into a future analysis. That's where I've already done my pre-processing, all the work that's not really the analytical side of things, but that I have to do on the setup side. And I'll put it in here, and I'll call it land use. And that's it, okay. Then it exports the features for me. And I always like to add it to my map, just to make sure that I didn't do anything stupid on the export, like have something small selected and then have it not give me the full export. But it's looking good. It looks like I didn't have anything selected, and it's rendering all this just fine. Okay, and the next thing I'll need to do, which I won't do on video because it's much simpler than the land use data, is go through a similar process for the floodplain data. I'm going to load up the floodplain data, subset it to anything that I need, and then load it into my geodatabase. Again, I'm not going to show that on video because it could get very tedious for you to watch me load and process additional data. But it's going to be a very similar workflow: add it to my map, check whether I want all of it, select the features I want somehow, and then just export a layer. Okay, that's it for this pre-processing phase for now.
What I want you thinking about for next week is how we're then going to take a layer that has all the floodplains in it, just assume that that's what we're going to have because I'm going to build it, and then, a layer that has all of the land use data in it, and turn it into our analysis. We're going to answer our question of, what agriculture is grown in flood plains, and that might be easy for you now. It's a pretty simple analysis at this point. The data setup's probably a little longer, but start thinking about that for next week. Okay, see you then.