Gazetteer of Historical Australian Places

The Gazetteer of Historical Australian Places (GHAP) is an aggregated gazetteer of Australian placenames, and humanities information related to places, with advanced search features, and the ability to create and visualise humanities maps in different ways.  

It includes a searchable index of layers contributed by humanities researchers, institutions, and the community. Launched at the end of 2020, GHAP continues to grow and improve over time. 

Search results and cultural map layers can be visualised in 3D, with timelines or as journeys and more. 

GHAP was first based on the Australian National Placenames Data (ANPS) database. We thank ANPS for gathering this data, and Greg Windsor and David Blair for helping us make it available. More recently the National Composite Gazetteer has been added. Layers from organisations such as AusStage and AustLang also provide a place-based way to access their data, and make it possible to compare it with other layers. Many layers on specific research areas have also been added by the humanities research community.

To find a placename in GHAP, you can conduct a simple or advanced search. 

  1. Type your search term into the search bar above the map. 
  2. Choose the parameter for your search from the dropdown menu. 
    • Contains: Show results that contain your search term.
    • Fuzzy: Show results with roughly similar spelling. The first results will be exact matches, and then places with less and less similar spelling will be listed.
    • Exact Match: Show results that match the exact term you entered, and no more.
    • anps_id: Show results with IDs from ANPS that match your search term.
  3. Use the checkboxes to choose if you want to search the Gazetteer, Layers, or both. 
  4. Select Search

Your search results will be shown. From here you can save your search, share the results, or view the results on a map. 

To apply filters and restrictions to your search, make an advanced search. 

  1. Follow steps 1-3 of the ‘Make a simple search’ procedure.
  2. Select Advanced Search. The advanced search pane will drop down.

To only search within a specific area of the map, specify a map area. There are three ways to do this:

  1. Choose a shape from the map controls.
    Screenshot of the map controls with three buttons highlighted. One shows a pentagon shape, one a square, and one a circle.
  2. Click and drag anywhere on the map to draw the shape. You can then alter the coordinates of the shape in the advanced search pane.
  1. Under Specify map area in the advanced search pane, choose a shape from the dropdown menu. 
  2. Enter coordinates for the shape. 
  3. Select Draw to draw the shape on the map. 

Under Search within a KML polygon in the advanced search pane, upload a file.

Standards note: This file must be a valid KML file and contain at least one polygon tag. 
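
As a rough illustration of what such a file contains, the sketch below builds a minimal KML document with one polygon and checks that a polygon tag is present. The placemark name, the coordinates and the helper names `polygon_kml` and `has_polygon` are made up for this example. KML lists coordinates as longitude,latitude, and a ring is closed by repeating its first point.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def polygon_kml(name, ring):
    """Return KML text with one Polygon. `ring` is a list of (lon, lat) pairs."""
    if ring[0] != ring[-1]:
        ring = ring + [ring[0]]  # close the ring by repeating the first point
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    polygon = ET.SubElement(placemark, f"{{{KML_NS}}}Polygon")
    outer = ET.SubElement(polygon, f"{{{KML_NS}}}outerBoundaryIs")
    linear_ring = ET.SubElement(outer, f"{{{KML_NS}}}LinearRing")
    coords = ET.SubElement(linear_ring, f"{{{KML_NS}}}coordinates")
    coords.text = " ".join(f"{lon},{lat}" for lon, lat in ring)
    return ET.tostring(kml, encoding="unicode")


def has_polygon(kml_text):
    """True if the KML text parses and contains at least one Polygon element."""
    root = ET.fromstring(kml_text)
    return root.find(f".//{{{KML_NS}}}Polygon") is not None


# Invented search area: a small rectangle given as (lon, lat) corners.
example = polygon_kml(
    "Search area",
    [(151.6, -33.0), (151.9, -33.0), (151.9, -32.8), (151.6, -32.8)],
)
```

Saving `example` to a file with a .kml extension gives something that can be tried in the upload field. Whether GHAP accepts every valid KML variant is not confirmed here.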

Your search will be restricted to this area. 

Add any Filters you want to restrict your search. Hover over the filter names to learn more about them. For example, you can:

  • Tick the box to search within the ‘Description’ for the word in the search field.
  • In ‘Layers’ type in the name of a layer or layers to limit the search to only those layers.
  • You can limit searches by State, Local Government Area (LGA), Feature Type, etc.

All these filters can be added together to limit the search. Eg: you could search for places starting with ‘Coo’ within a circle you have drawn around south east Queensland, with Feature Type ‘Mountain’, etc.

You don’t necessarily need to enter a word in the main search field. You could simply find all records where ‘Feature Type’ is ‘Mountain’.

Note that these filters depend on the people who have contributed layers adding this information in.

Some researcher contributed layers have ‘extended data’ with different kinds of information related to that specific project. You can also search within a layer’s Extended Data using special syntax. See below under ‘Faceted Search’.

To search for a list of placenames, upload a file under Search for a list of place names.

Standards note: This must be a .txt file. The placenames must be listed one per line, or separated by commas, e.g. ‘Richmond,Footscray’. 

Once you’ve set all the restrictions to your search that you want, select Search. Your search results will be shown. From here you can save your search, share the results, or view the results on a map.

If you think you will be revisiting the same search in the future, you can save it for later. 

  1. Select Login at the top right and login. 
  2. Follow the instructions to make a simple search or make an advanced search.
  3. On the search results page, select Save your search

You can find all your saved searches in My searches if you are logged in. 

  1. Hover over Browse layers in the menu. 
  2. Select Layers to view layers or Multilayers to view multilayers. 
  3. Sort the list by Name, Size, Type, Content Warning, Created, or Updated by selecting the relevant heading. 

From here you can select the name of a layer or multilayer to see its details, or select View Maps to open it on a map. 

Layers are additions to GHAP that are contributed by users. They contain data about placenames, events, sites, journeys, people, and more. They can be layered on top of the information in GHAP to show more information or give more context. 

Multilayers group several layers together so their information can be viewed on the same map. They’re useful for showing overlapping or intersecting data from different layers. 

Adding layers and multilayers to GHAP:

  • Makes humanities information more discoverable 
  • Contributes to the deeper meaning of places and people’s knowledge of culture 
  • Helps create interesting and interactive map visualisations 
  • Allows you to share your maps on the web
  • Gives new insights into research through visualisation and analysis 
  1. If you don’t already have a login, select Login at the top right. 
  2. Click Register and fill in your details.
  3. You will be emailed a confirmation. Check your junk/spam/trash folders if it doesn’t arrive in a few minutes. Some organisations hold this kind of message till the next day, so you may need to be patient.
  4. Follow the instructions in the confirmation email.
  5. Contact tlcmap@newcastle.edu.au if there are any problems.
  6. Once you have logged in you can create layers and multilayers.

To create your own layer, you can just add one or two places, or upload a file of thousands of records.

  1. Select My layers. 
  2. Select Create Layer. 
  3. Fill in the form. You only need to enter a title and description. Everything else is optional and you can come back and add in more details later if you want. 
  4. Set to private if you don’t want anybody to see it until you are ready, or to urgently remove it from public view. You cannot view the map of the layer if it is set to private.
  5. Select Create Layer at the bottom. 

This might be useful if you just want to quickly add a couple of places or events. 

  1. Click Add to layer
  2. Fill in the form. 
    • The only requirements are a title and coordinates, but you can add as much information as you like.
    • You can enter coordinates manually or just click on the little map.
  3. Select Add item
  1. Click Import from file
  2. Upload a CSV, KML or GeoJSON file. A CSV file can be saved from an Excel spreadsheet. It is for tabular information, so it can be used for lists of places, coordinates and any other information. KML and GeoJSON files are specifically for mapping, and you can import or export them from other mapping systems.

Standards note: This must be a CSV, KML or GeoJSON file.

To upload information from a spreadsheet, save the spreadsheet as a CSV file. In Excel, go to ‘Save As’ and choose the file type CSV-UTF8. Plain CSV works, but CSV-UTF8 is recommended and is necessary for symbols and non-Latin alphabets.

The CSV column names are important: 

Required

  • ‘Title’ and/or ‘Placename’: If your information is specifically to name places the ‘Placename’ will be the ‘Title’. ‘Title’ and ‘Placename’ can both be used because in many cases you want the ‘Title’ that someone sees when they click a point to be informative about more than just the place, and also want to provide the name of the place. Eg: you might want someone to see the title of an event, such as, ‘Bennelong meets Governor Phillip.’ rather than ‘Sydney Cove’.
  • Latitude: Must be in decimal format, eg: -32.929
  • Longitude: Must be in decimal format, eg: 151.772

Example templates:

Recommended

  • Date or DateStart and DateEnd: Dates can be in the format dd/mm/yyyy or yyyy-mm-dd or yyyy. You can have three digit dates. For BC dates, use a negative sign. Eg: 10,000BC would be -10000. You can use a single date, or a span within which an event happened or may have happened. 
  • Linkback: A URL linking to another site relevant to this information. This can be very important. It can drive traffic to your project’s site, or link back to the source of the record, so that people can access it directly, get more details or see the information within a system that is specially designed for it. Eg: if your layer relates to Trove articles you can link to the Trove article. 

Example templates

Optional 

  • Type: The layer ‘type’ will be applied to every item in the layer, but you can also set it for each record. Check the drop down for what you can put in this column. 
  • State: A state or territory of Australia. 
  • LGA: local government area 
  • Feature Term: ANPS data includes hundreds of ‘feature terms’ to label things as mountains, rivers, towns, wells, trig stations, etc, that you can draw from. 
  • Source: wherever possible always credit the source of information so people can confirm and trust the information. This could be a citation or other acknowledgement. 

Example templates:

Extra

Any other columns in your spreadsheet will be handled as ‘extended data’ and included in visualisations and data exports from TLCMap, but we can’t adjust or make any assurance about how they will be displayed.
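
For those preparing a file with a script rather than a spreadsheet, the column layout described above could be produced along these lines. This is only a sketch: the two records, the example.org link and the extra ‘Profession’ column are invented, and the helper name `layer_csv` is hypothetical.

```python
import csv
import io

# Required, recommended and optional column names from the lists above,
# plus one invented extra column that would become 'extended data'.
COLUMNS = ["Title", "Placename", "Latitude", "Longitude",
           "DateStart", "DateEnd", "Linkback", "Source", "Profession"]

ROWS = [
    {"Title": "Example event", "Placename": "Example place",
     "Latitude": -32.929, "Longitude": 151.772,
     "DateStart": "1850-01-01", "DateEnd": "1850-12-31",
     "Linkback": "https://example.org/record/1",
     "Source": "Invented example record", "Profession": "Blacksmith"},
    # A BC date is written with a negative sign, e.g. 10,000BC is -10000.
    {"Title": "Example ancient site", "Placename": "Example place 2",
     "Latitude": -33.5, "Longitude": 150.25,
     "DateStart": "-10000", "DateEnd": "",
     "Linkback": "", "Source": "Invented example record", "Profession": ""},
]


def layer_csv(rows, columns=COLUMNS):
    """Return CSV text with a header row followed by one line per record."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = layer_csv(ROWS)

# To save the text for upload, write it out as UTF-8, for example:
# with open("my_layer.csv", "w", newline="", encoding="utf-8") as f:
#     f.write(csv_text)
```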

  1. Select Upload file.

If you add a list of records from a file, check that all were added. You can do this by looking at the last record in the layer’s web page and comparing it to the last record in the file you uploaded. If GHAP has failed to upload all records, jump to Troubleshooting to explore solutions.

Any placenames and place meanings that Indigenous people wish to be publicly known can be added to a layer. 

Anybody adding Indigenous places should respect the wishes of Indigenous people and observe protocols for consultation and protection of Indigenous knowledge, places, and culture. GHAP is meant for public information, so do not add places that are secret or where this could result in desecration. 

One suggestion to promote places but ensure they are accessed in the right way is to put the contact point to arrange access on the map, rather than the place itself. E.g., if there is a traditional ochre mine called ‘Red Ochre’ to which guides or tours can be arranged, add the site of the organisation arranging access, and name the place ‘Red Ochre Access Point’ or similar, as preferred. 

GHAP currently includes Indigenous placenames that are part of the Australian National Placenames Data. User contributions have commenced with some Indigenous data sets related to language and history. We are happy to hear of any other major sources of Indigenous placename data to include, or community groups or research projects that want to add layers – contact tlcmap@newcastle.edu.au

Having Indigenous presence and information in TLCMap systems has always been one of our top priorities and motivations. TLCMap projects have, from the beginning, included projects about Indigenous places involving consultation and Indigenous researchers. 

We are also working on visualisations and features for displaying Indigenous information, such as ways to structure and interact with cyclical time, and ways to visualise journeys, that could be used for traditional travel routes. This activity is driven by projects that include Indigenous and non-Indigenous people, consultatively and as researchers. TLCMap also employs Indigenous software developers.


Once you’ve created your layer, you can view it on a map or share it with others. 

You can find all your layers in My Layers if you are logged in. 

  1. Select My Multilayers. 
  2. Select Create Multilayer. 
  3. Fill in the form. You only need to enter a title and description. Everything else is optional and you can come back and add in more details later if you want. 
  4. Select Create Multilayer at the bottom.

You’ll see a page showing the details of your new multilayer. To complete the multilayer, you need to add layers to it.

  1. Select Add a Layer below the multilayer’s metadata table. 
  2. Choose whether you want to add a layer from All public layers or just My layers (i.e. layers you’ve created yourself). 
  3. Choose a layer from the dropdown menu. 
  4. Select Add. 
  5. Repeat these steps to add as many layers as you like. The multilayer will save automatically whenever you add a layer. 

One of the main advantages of GHAP is the ability to view individual search results, layers, and multilayers on a map. To view a record on a map, look for the View Maps button. You will find it at the top of the search results page, at the right of the browse layers and multilayers pages, and at the top of individual layer or multilayer detail pages.

An orange button that says 'View Maps'.
  1. Select View Maps to see a list of visualisation options for a record. 
  2. Choose any map view option to open the record on a map in a new tab.

Map views

Each kind of map view is suitable for different purposes and projects. 

All map views include options for satellite imagery, street maps, and other common base layer options. All maps also include clickable points that display information. 

To comply with accessibility standards, the colours of the dots on a multilayer map have been selected based on web design guidelines for colour blindness. 

  • 3D Viewer: Shows dots on a 3D map. This is the most basic visualisation and good for simple maps. E.g. 1833 NSW town populations with placenames 
  • Cluster: Shows dots close to each other merged into a large dot with a number indicating how many points are in it. This map is best for large amounts of data, or large search results, or if points are very close together. E.g. 19th Century Australian Bushfire Reporting 
  • Journey Route: Shows a line between points in the order in which they were added to GHAP. This map is best for journeys or showing events that don’t have dates associated with them in a sequence. For example, if you wish to indicate a route, or give directions, these may be followed at any time. E.g. Kokoda Trail
    (Currently, the order of places can’t be changed after you upload to GHAP, and if you edit a place it may change the order. Ensure your data is in the right order before you import it.)
  • Journey Times: Shows a line between points in the order of their dates. This map is best for visualising a specific journey where each point has a date, such as a ship’s journey with a log of times and coordinates, or a band’s tour with dates in different cities. It could also be an alternative to the Timeline view, to show the order of events. E.g. The Easybeats ‘Big Show’ Tour 
  • Timeline: Shows an interactive timeline on the map so you can see where events occurred across time. You can move each end back and forward, or select a range and ‘play’ it to watch events show up and disappear in the moving time window. This map is best for collections of events with dates and locations, such as where incidents occurred across the country, or times that institutional sites started and ceased operation. It’s also an alternative for viewing journeys with dates. E.g. Australian Prisons 
  • Werekata Flight By Route: Shows a 3D map with a flying bird’s eye view moving from point to point on a journey route. This map is for animating a journey route. It enables you to better imagine how the journey may feel in person. ‘Werekata’ is Awabakal for kookaburra. E.g. Crocodile and Rainbow Serpent, Walyalup (Fremantle) 
  • Werekata Flight By Time: The same as Werekata Flight By Route, but in order of dates. E.g. Malaspina 1789 

TLCMap, and therefore GHAP, is about open data, making research public and engaging, and enabling re-use of data in further research. The open and transparent sharing of information has always been fundamental to the academic process.

Standards note: In GHAP, layers and multilayers include fields enabling you to specify the licencing terms, and any cautions and acknowledgements. If you are unsure about licencing, try Creative Commons for licencing covering common scenarios. We recommend CC BY (anyone can copy and re-use with attribution) which is commonly used for Australian government data, or CC BY-NC (free for non-commercial re-use only).

Handling Indigenous information involves ethical responsibility including and beyond copyright law. Do not upload any information without permission, and in the licencing and re-use section, add terms that anyone wanting to re-use the information should also obtain permission. 

There are different ways you can share GHAP content.

To share the raw data for a search result or layer, select Download to download the data as a CSV (works with Excel), KML or GeoJSON file. You can then send this file to others. 

Simply choose ‘ROCrate’ from the list of Download options for any search result or layer.

GHAP is not a research data repository but we do make it easy to deposit your mapping research data in an official research data repository. ROCrate is an open standard format for saving research data with metadata in a way that can be accessed into the future, and which can be easily uploaded to a research data repository.

An ROCrate is simply a zip file containing the data (in this case the CSV, KML and GeoJSON exports of your search results, layer or multilayer) and a metadata description of it, all in open standard formats. You can edit the metadata using an ROCrate editor like Describo.

Search results and layers may change over time, so if you have based a paper on information in GHAP you should probably ensure you have an ROCrate of it and that it is deposited following institutional research data requirements.

Note that GHAP doesn’t mint DOIs – the repository you deposit in or your institution may mint DOIs.

Software developers can use the web services API to conduct searches and access layers remotely in their code.

For those who know a little HTML, a map can be embedded in a webpage by using an iframe. 

The following code:

<iframe src="https://views.tlcmap.org/v2/journey.html?load=https%3A%2F%2Fghap.tlcmap.org%2Flayers%2F721%2Fjson%3Fline%3Dtime" width="90%" height="400"> </iframe>

will embed a map in a web page like this:

Some web publishing systems, or organisational web security policies, may have constraints against embedding content from other sites. Check that the page you aim to embed the map in will allow this. Sometimes there is a workaround, eg: WordPress may block any embed except YouTube and similar, but you can insert a custom HTML block.
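
Judging only from the iframe example above, the `load` parameter appears to hold the percent-encoded address of the layer’s JSON. Under that assumption, the embed code for another layer could be generated with a small script like this. The helper name `embed_code` is hypothetical and the viewer address is simply copied from the example.

```python
from urllib.parse import quote

# Viewer page taken from the iframe example above (journey view).
VIEWER = "https://views.tlcmap.org/v2/journey.html"


def embed_code(layer_json_url, width="90%", height="400"):
    """Build an iframe tag whose `load` parameter is the percent-encoded layer URL."""
    # safe="" makes quote() encode ':', '/', '?' and '=' as well.
    src = f"{VIEWER}?load={quote(layer_json_url, safe='')}"
    return f'<iframe src="{src}" width="{width}" height="{height}"> </iframe>'


snippet = embed_code("https://ghap.tlcmap.org/layers/721/json?line=time")
```

For the layer address used here, `snippet` is the same text as the iframe code shown above. Other map views may expect different parameters, so check the address that View Maps opens before relying on this.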

Advanced

Some layers have extended data that is specifically relevant to that layer, or the project it comes from. When a layer is added, for example by uploading a CSV spreadsheet, any extra columns are added as ‘Extended Data’. It is possible to search this information within any layer for granular and nuanced queries and, because you can save searches, to create map views of these various nuances. This can be a very powerful tool for mapping different facets of a layer.

  • Go to the ‘Advanced’ section of the Search page.
  • In the ‘Layers’ field enter the name of a layer with extended data.
  • In the ‘Extended Data’ field enter a search term. Special syntax is needed.

Extended Data Search Syntax

The syntax is in the format:

[attribute] [condition] [value]

You could think of the ‘attribute’ as the column name in a spreadsheet. It means something like ‘show me all the information in this layer where this is that’. Eg: if your table of convicts had a column for ‘Profession’ – “Give me everything in the convict layer where ‘Profession’ is ‘Blacksmith’.”

condition	Type of value
textmatch	Contains. e.g. name textmatch 'Jo'
=	Exact match. e.g. name = 'Joanna'
<	Numeric input only. e.g. Cost < 500
>	Numeric input only. e.g. Cost > 500
before	Date format only (YYYY-MM-DD, YYYY-MM, YYYY). e.g. Start before 1995-12, End before 2012-05-05
after	Date format only (YYYY-MM-DD, YYYY-MM, YYYY). e.g. Start after 1995-12, End after 2012-05-05

(To handle values with spaces in them, use quote marks.)

Some examples:

  • Profession textmatch Blacksmith
  • ‘Profession’ textmatch ‘Chimney Sweep’
  • Age < 12
  • DateOfPhotograph before 1900-01-01

You can combine these with the word AND, eg:

‘Profession’ textmatch ‘Chimney Sweep’ AND Age < 12
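
If you build many of these searches, the text can be assembled with a small helper. The sketch below only produces the search string to paste into the ‘Extended Data’ field; it does not call GHAP. The function names are hypothetical, and the choice to wrap values containing spaces in plain single quotes is an assumption based on the examples in the table.

```python
# Conditions listed in the syntax table above.
CONDITIONS = {"textmatch", "=", "<", ">", "before", "after"}


def condition(attribute, op, value):
    """Return one '[attribute] [condition] [value]' clause as text."""
    if op not in CONDITIONS:
        raise ValueError(f"unknown condition: {op}")
    value = str(value)
    if " " in value:
        value = f"'{value}'"  # quote values that contain spaces
    return f"{attribute} {op} {value}"


def all_of(*clauses):
    """Join clauses with AND."""
    return " AND ".join(clauses)


query = all_of(
    condition("Profession", "textmatch", "Chimney Sweep"),
    condition("Age", "<", 12),
)
```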

Combining faceted searches within layers with multilayers is a powerful way to visualise and compare nuances in your data. You can create a map that shows different aspects and nuances of any map layer. For example, if your layer contains information on different categories, you can create separate maps of each category, then recombine them in a multilayer for comparison. This is easiest to understand with examples.

First we need to isolate the facets with a search within a layer, then put them back together with a multilayer to compare and contrast them.

Example A: Compare facets of the Prisons layer
The Australian Prisons layer contains information about different kinds of prisons, and we’d like to see these differences on a map. Specifically we’d like to compare prisons for convicts, for women, and reformatories. Some of this information is in the description of each prison.

  • Go to Advanced search
  • Enter ‘convict’ in the search field.
  • Tick the box to search ‘Description’.
  • Enter the name of the layer you want to search within in the ‘Layers’ field (ie: Australian Prisons).
  • Click the ‘Search’ button. The search page would look like this:
  • When you see the Search Results click the ‘Save Search’ button, and give it a memorable name, like ‘Convict Prisons’.
  • Repeat for the search ‘female’ and ‘reformatory’.

Now you have 3 saved searches. You can find them under My searches.

  • Go to My multilayers and create a new multilayer.
  • Click the ‘Add Saved Search’ button to add each layer.

Now you can view these different facets of your layer.

Example B: Compare Arrests for Vagrancy By Country of Birth

The layer Vagrancy Arrests, NSW: 1862-1880 [Coleborne, 2023] includes a lot of extended data about people arrested for vagrancy, drawn from historical police records. The ‘attribute’ or column called ‘CountryOfBirth’ indicates which country the person came from. There are some from China, Brazil, Jamaica, Canada and America as well as people from Europe. In this example, given the troubled politics between Britain and Ireland at the time, we would like to investigate attitudes of the British colonial government to, and the social status of, Irish immigrants. We can do this by seeing if there are any differences between the arrests of people from England, Scotland and Ireland.

  • Go to Advanced search
  • In ‘Layers’ type and select the layer ‘Vagrancy Arrests, NSW: 1862-1880’
  • In the ‘Extended Data’ field enter: CountryOfBirth textmatch ‘Ireland’
  • Click the ‘Search’ button. The search page would look like this:
  • When you see the Search Results click the ‘Save Search’ button, and give it a memorable name, like ‘Irish Vagrants’.
  • Repeat for the search ‘England’ and ‘Scotland’ and save the results for each. (You could also try the other countries of birth. These are interesting historically and biographically, but too few for statistical comparison: Born at Sea, n/a, China, Australia, Germany, Canada, America, France, Brazil, Jamaica, Isle of Man.)

Now you have 3 saved searches. You can find them under My searches.

  • Go to My multilayers and create a new multilayer.
  • Click the ‘Add Saved Search’ button to add each layer.

Now you can view these different facets of your layer.

The earth is curved in three dimensions. This makes statistics on coordinates hard. You can’t simply run statistics on the latitude and longitude. Lines of longitude nearer the poles are much closer together than those nearer the equator. Calculations generally need to be based on distances between coordinates, not the coordinates themselves, and those distances need to be calculated in 3D space and time. So we have built some common and useful statistical methods into TLCMap to make it easier.
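
To see why raw coordinates mislead, here is a minimal sketch of one common way to measure distance over the earth’s surface, the haversine formula. It treats the earth as a sphere, which is an approximation, and it is not necessarily the method TLCMap uses internally.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean radius; assumes a spherical earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude is a very different distance at different latitudes.
at_equator = haversine_km(0, 150, 0, 151)       # roughly 111 km
at_60_south = haversine_km(-60, 150, -60, 151)  # roughly 56 km
```

The same one-degree difference in longitude is about half as far at 60 degrees south as at the equator, which is why an average of raw longitudes is not an average of distances.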

Understanding what these figures mean and how they are calculated is important to avoid misinterpretation. Even the simplest statistics can be very informative.

Sometimes these methods don’t necessarily ‘prove’ something, but they are very useful for a deeper understanding of spatial and temporal information. Often the process of analysis involves some back and forth, as you find some methods uninformative, others interesting, and adjust parameters, learning and developing new questions. In many cases, a number alone is uninformative, and the value comes from comparison of one layer with other layers, or one figure with another.

These methods are useful for:

  • interpretation of information
  • spotting patterns you cannot see
  • confirming, quantifying or questioning patterns you can see
  • quantifying, measuring and comparing differences between different types of location or times
  • providing quantitative evidence to support an argument

To do some quantitative analysis of map layers, browse to a layer and select from the following options under the Analyse menu.

These are common statistics that are fast for the computer to process. Some of these results can be seen on a map by clicking ‘View Map’. Results can be downloaded for further analysis in spreadsheets or a GIS.

Total Places

The number of places in this layer.

Area

The area of the ‘convex hull’ of the layer.

Convex Hull
The ‘convex hull’ is the shape, or polygon, you’d make if you drew a line around all the outermost places in the layer. The coordinates of this polygon are provided in case you want to use them in a GIS system. You may wish to compare the area of one type of information with another.
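
For readers who want to reproduce a hull outside GHAP, the sketch below uses a standard technique, Andrew’s monotone chain, on (longitude, latitude) pairs. It treats the coordinates as flat, which is an assumption that is reasonable only for small areas, and it is not necessarily how TLCMap computes its hull. The sample places are invented.

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b; positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull corners in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop the last corner while it would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Invented places as (longitude, latitude): four corners and one interior point.
places = [(150.0, -34.0), (152.0, -34.0), (152.0, -32.0), (150.0, -32.0), (151.0, -33.0)]
hull = convex_hull(places)
```

The interior place is left out of `hull`, and the four outer places remain. Measuring the hull’s area in square kilometres needs a map projection or spherical geometry, which this sketch does not attempt.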

Density

The number of places per square kilometre.

Centroid

The midpoint of the layer.

Bounding Box

Bounding boxes are often used in describing map layers. This is a box drawn along lines of latitude and longitude that includes all points in the layer.

Most Central Place

The place closest to the centroid of the layer.

Most Distant Place From Centre

The place furthest from the centroid of the layer.

Distribution – Average Distance from Centroid

This measure of distribution can give you a sense of whether, or how much, places are spread out, evenly spaced, or mostly concentrated around a point, or far from the middle.
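
The basic measures above can be approximated in a few lines for checking results by hand. This is a sketch under stated assumptions: the centroid is taken as the average of the places’ positions on a sphere, distances use the haversine formula, and the four places are invented. TLCMap’s own definitions may differ in detail.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumes a spherical earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(math.radians(lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def centroid(coords):
    """Average the places as 3D unit vectors, then convert back to (lat, lon)."""
    x = y = z = 0.0
    for lat, lon in coords:
        phi, lam = math.radians(lat), math.radians(lon)
        x += math.cos(phi) * math.cos(lam)
        y += math.cos(phi) * math.sin(lam)
        z += math.sin(phi)
    return (math.degrees(math.atan2(z, math.hypot(x, y))),
            math.degrees(math.atan2(y, x)))


def summarise(places):
    """`places` maps a name to (lat, lon). Returns the simple layer statistics."""
    coords = list(places.values())
    c_lat, c_lon = centroid(coords)
    dist = {name: haversine_km(c_lat, c_lon, lat, lon)
            for name, (lat, lon) in places.items()}
    lats = [lat for lat, _ in coords]
    lons = [lon for _, lon in coords]
    return {
        "total_places": len(places),
        "centroid": (c_lat, c_lon),
        "bounding_box": (min(lats), min(lons), max(lats), max(lons)),
        "most_central": min(dist, key=dist.get),
        "most_distant": max(dist, key=dist.get),
        "average_distance_from_centroid_km": sum(dist.values()) / len(dist),
    }


# Invented places: three close together and one far to the north-east.
summary = summarise({
    "A": (-33.0, 151.0),
    "B": (-33.1, 151.1),
    "C": (-32.9, 151.0),
    "D": (-30.0, 153.0),
})
```

The bounding box here is a plain minimum and maximum of latitudes and longitudes, which is fine for Australian data but would need care for layers that cross the 180 degree meridian.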

Distribution – Average Distance from Centroid / Area of Convex Hull
By dividing the Average Distance from Centroid by the area of the convex hull, we can get a measure of how spread places are, or how distant or concentrated they are on the middle – in a way that is more comparable to other layers. For example, we might want to compare a layer that covers a very large area, with one that covers a relatively small area to find out if the pattern of one sort of thing is more concentrated or more spread out. The Average Distance from Centroid for the first layer, would be very large compared to the other, but this enables us to consider it as a proportion. For example, let’s say you are wondering if music performances are spread out, or cluster around a place that might be famous for live music, and whether this is different in rural compared to urban areas. Music venues in rural areas are probably always much further away from each other than in the city simply because spaces are larger, but by treating it as a proportion of the whole area, you can make a comparison of distribution between small and big areas.

The advanced statistics are more complicated and take longer to run. For large datasets you may need to wait a minute for the results. Some of these are common statistics for which there are plenty of good explanations on the internet.

Min distance between 2 places
The 2 places that are closest together. After finding the shortest distance of all places to any other place in the layer, this is the shortest of all.

Max distance between 2 places
The two places that are furthest apart.

Average distance between places
Generally, places are this far apart from each other. After finding the distance between each place to every other place, this is the average distance between places.

Median distance between places
This is the median of all the distances between places. If you put all values in order, the median is the middle one. If the distances in kilometres between each place and each other place were 1, 2, 3, 5, 9, 29, 145 then the median distance is 5. This is useful when comparing with the average to get a sense of the distribution. The average of these values is about 27.7. The median is much less than that. This means there are more lower values than higher values: most places tend to be close together, but a few are very far away.
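
The arithmetic in this example can be checked with Python’s standard `statistics` module:

```python
import statistics

# The example distances in kilometres from the paragraph above.
distances_km = [1, 2, 3, 5, 9, 29, 145]

mean_km = statistics.mean(distances_km)      # 194 / 7, about 27.7
median_km = statistics.median(distances_km)  # the middle value, 5
```

The single large value of 145 pulls the average well above the median, which is the pattern described above.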

Standard Deviation of distance between places
The standard deviation also indicates the distribution. It’s how much places tend to deviate from the average. If it is low it means most distances are about the average. If it is large it means the distances tend to be very different. This is useful for comparison. Eg: if the standard deviation of one layer is much higher than another, it means places in the first layer are more scattered and places in the second layer are more clustered close to each other.

Average Min distance between neighbouring places
The average distance between places and every other place may not be what you really want to know. To understand how close to each other places are, you may not be interested in the closeness of every other place, but only the closeness of the nearest place. It might be interesting to know how close campsites, waterholes, or neighbouring homesteads tend to be, because that would provide an indicator of the relative attractiveness of this kind of country to that, or indicate population density, or provide insights into historical behaviour. So this is the average of the minimum distance between each place and each other place – ie: it is the average distance of each place to its closest neighbour.

Median Min distance between neighbouring places
This is the median of the distances between each place and its closest neighbour, rather than all other places. See ‘median’ above for how this can be used with the average to understand distribution.

Standard Deviation of Min distance between neighbouring places
This is the standard deviation of the distances between each place and its closest neighbour, rather than all other places. See above for an understanding of standard deviation.
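
The difference between the ‘every other place’ statistics and the ‘closest neighbour’ statistics can be seen in a short sketch. It assumes haversine distances on a spherical earth and uses the population standard deviation; which form of standard deviation TLCMap reports is not stated here. The three places are invented.

```python
import itertools
import math
import statistics

EARTH_RADIUS_KM = 6371.0088  # assumes a spherical earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(math.radians(lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def describe(values):
    """Min, max, mean, median and population standard deviation of a list."""
    return {"min": min(values), "max": max(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            "stdev": statistics.pstdev(values)}


def pairwise_stats(coords):
    """Statistics over the distance from each place to every other place."""
    return describe([haversine_km(*a, *b)
                     for a, b in itertools.combinations(coords, 2)])


def nearest_neighbour_stats(coords):
    """Statistics over the distance from each place to its closest neighbour."""
    nearest = [min(haversine_km(*a, *b) for j, b in enumerate(coords) if j != i)
               for i, a in enumerate(coords)]
    return describe(nearest)


# Invented places as (lat, lon): two close together and one further east.
coords = [(-33.0, 151.0), (-33.0, 151.1), (-33.0, 151.5)]
pairwise = pairwise_stats(coords)
nearest = nearest_neighbour_stats(coords)
```

With these three places the average closest-neighbour distance is smaller than the average distance between all pairs, because the long distances to the far place count only once.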

Cluster analysis is a way to identify places that are closely grouped together. There are many different methods available, so we provide some of the common and easier to understand methods. The resulting clusters often depend greatly on the parameters you set. It’s often not a case of the analysis simply giving you a definitive answer of what the clusters are, but of tweaking the parameters to interactively interpret what clusters make the most sense and why.

Detailed descriptions of these methods can be found on the internet, but here are some brief summaries:

DBScan

DBScan creates clusters by finding places within a set distance from each other. If a place is further than that distance from any other place in the cluster, it is placed in a separate cluster. Eg: If the distance is set to 4km and A is 6km away from B, then A and B are not in the same cluster. If C is 4km from A, then C is in the same cluster as A. If C is also 3km from B, then they all join up and A, B and C are all in the same cluster.

You set the distance. This means you may do a bit of trial and error in finding a good distance to make sensible clusters. Sometimes you may find that there is a threshold distance below which every point is alone in its own cluster, and above which clusters start to form. You may also find that the clusters don’t change much within a certain range, which would make you more confident that those are the clusters. This tweaking of parameters for clustering can work in tandem with your knowledge of the field to confirm, deny, or illuminate unexpected patterns.

One advantage of this method over some others is that it can find clusters that are unusual shapes, such as a long cluster curling around a small central cluster, or donut shaped clusters.

Bear in mind that what we understand as ‘near’ or ‘far’ is not only measured in kilometres. Travel time might be important, and 1km in the mountains is very different to 1km on the plains. So is the sense of how far a person feels it is normal to travel: in the city, 1 hour’s travel time is a long way, while in the outback it is normal to travel that far. This might skew clusters such that urban or mountain areas that would be many clusters are grouped into one, while rural or flat clusters aren’t identified as clusters.

You can also set the ‘Number of Neighbours’. This is an optional parameter enabling you to set the maximum number of places that can be in a group. As the groups are built up, even if a place is within the set distance, if the maximum number of places is exceeded it will be placed in a different group. This may not be relevant for many cases but can be handy for assigning a certain number of places, for example to a group of people to work on.
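The distance rule described above for DBScan can be sketched in a few lines of Python. This is only an illustration of the chaining idea (A joins C, C joins B, so all three end up together), not GHAP’s implementation. It does not include the optional ‘Number of Neighbours’ limit, and it does not treat isolated places as noise the way a full DBSCAN can. The function name is made up, and the default distance function assumes flat (x, y) coordinates in kilometres. For latitude and longitude you would pass in a great-circle distance function instead.

```python
import math

def threshold_clusters(places, max_km, dist=math.dist):
    """Group places so that any two places within max_km of each other
    end up in the same cluster. Returns lists of indices into places."""
    parent = list(range(len(places)))

    def find(i):
        # Follow links up to the representative of i's group.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Join the groups of every pair that is close enough.
    for i in range(len(places)):
        for j in range(i + 1, len(places)):
            if dist(places[i], places[j]) <= max_km:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(places)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

# A and B are 6km apart, but both are within 4km of C; D is far away.
a, b, c, d = (0, 0), (6, 0), (3, 2), (50, 50)
groups = threshold_clusters([a, b, c, d], max_km=4)
```

Running this with several values of `max_km` is one way to look for the range in which the clusters stay stable.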

K Means
This is one of the most common clustering methods. With K Means you set how many clusters you want to group places into, and the algorithm will find a good fit for each place into a cluster. The way this is done is by forming small groups and finding the average distance to the centroid of that group. Places are shuffled around until the average distance of each place to the centroid of its cluster is lowest. This means the clusters produced are those that are the most tightly packed around a central point. This is likely to identify clumped clusters, whereas DBScan can identify oddly shaped clusters.
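To make the ‘shuffling’ concrete, here is a minimal sketch of the standard K Means procedure (often called Lloyd’s algorithm). It is not GHAP’s code. It assumes flat (x, y) coordinates, and it starts from the first k places so that the result is repeatable. Real implementations usually start from random or specially chosen points and may give slightly different clusters on each run.

```python
import math

def k_means(places, k, iterations=100):
    """Lloyd's algorithm on planar (x, y) points.
    Returns (labels, centroids), where labels[i] is the cluster of places[i]."""
    if not 1 <= k <= len(places):
        raise ValueError("k must be between 1 and the number of places")
    # Assumption for repeatability: start from the first k places.
    centroids = [tuple(p) for p in places[:k]]
    labels = [None] * len(places)
    for _ in range(iterations):
        # Step 1: assign each place to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in places
        ]
        if new_labels == labels:
            break  # nothing moved, so the clusters have settled
        labels = new_labels
        # Step 2: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(places, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids
```

Because every place is pulled towards the nearest central point, the clusters come out roughly round, which is why this method suits clumped data better than long or ring shaped groups.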

Places can be clustered according to how close together they are in space, or in time. If there are places in a layer that don’t have dates, they aren’t included in the analysis.

The way we have implemented temporal clustering is similar to the DBScan spatial clustering method. You set an amount of time. If events have dates within that amount of time, then they are added to the same cluster. If they are separated by a span of time greater than the set amount of time, they are in separate clusters. You can specify time in years, days or both. Note that this doesn’t adjust for leap years or non leap years. It may be informative to compare spatial clustering with temporal clustering. This may show a progression of events across different places at different times, or bursts of activity in the same place at different times.
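One possible reading of this rule, sketched in Python: sort the dates, then start a new cluster whenever the gap to the previous date is larger than the amount of time you set. This is an interpretation for illustration, not GHAP’s code. The function name is made up, a year is counted as a flat 365 days (matching the note about leap years), and places without a date are passed in as `None` and skipped.

```python
from datetime import date

def temporal_clusters(dates, years=0, days=0):
    """Split dates into clusters wherever two consecutive dates are
    further apart than the chosen window."""
    # Assumption: a year is 365 days, with no leap-year adjustment.
    window_days = years * 365 + days
    clusters = []
    # Places without dates are left out of the analysis.
    for d in sorted(d for d in dates if d is not None):
        if clusters and (d - clusters[-1][-1]).days <= window_days:
            clusters[-1].append(d)  # close enough to the previous date
        else:
            clusters.append([d])    # gap too large: start a new cluster
    return clusters

events = [date(1850, 1, 1), None, date(1860, 1, 1),
          date(1852, 1, 1), date(1850, 6, 1)]
by_two_years = temporal_clusters(events, years=2)
```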

A common question is whether one sort of thing is close to or far from another sort of thing. Note that asking whether A places are close to B places is not the same as asking if B places are close to A places. For example, asking if music venues are usually close to restaurants is different to asking if restaurants are usually close to music venues. We might expect that music venues are close to restaurants as people often want to eat when they are enjoying a night out at music. On the other hand, people going out to eat aren’t always on their way to hear live music. We might expect that music venues are usually close to restaurants, but that there are many restaurants far from music venues. Another way to imagine this is if you have a layer where you only have places within one city (A), while you have another layer of some other type of place for the whole country (B). Asking if A is close to B can give a sensible answer, while asking if B is close to A, is asking if places all over the country are close to the city you have data for.

You may be specifically interested in how close one sort of thing is to another, for example to see how far homesteads generally were from towns, to get some sense of how many hours or days ride it would typically have taken. Eg: if the Average Min Distance were 20km, we might infer homesteads were typically within a day’s ride from towns. In other cases, these figures might be more meaningful by comparison – are music venues or sports venues closer to restaurants? Are music venues closer to restaurants or train stations and bus stops?

To analyse closeness, go to the layer whose places you want to measure from, and choose ‘Closeness Analysis’ from the dropdown. Then, in the box, start typing the name of the other layer that you want to compare it to and choose from the options. Click ‘Analyse’.

The following results can give you an indication of typical ‘closeness’ of one thing to another. The two layers are ‘A’ and ‘B’.

Max Distance
The maximum distance between any place in A and any place in B.

Average Min Distance
After finding the minimum distance from each place in A to any place in B, this is the average of those minimum distances. This is probably the clearest and simplest indicator of how close A typically is to B.

Min Min Distance
The nearest any place in A is to a place in B. Of all the shortest distances from any place in A to any place in B, this is the shortest.

Max Min Distance
The furthest any place in A is from its nearest place in B. Of all the minimum distances from A to B, this is the longest.

Median Min Distance
Of all the shortest distances from places in A to places in B, if you put them in order from shortest to longest, this is the middle one.

Average Min Distance / Area
This provides a common measure of ‘closeness’ regardless of scale. This is the Average Min Distance, described above, as a proportion of the total area. By making it a proportion of the total area, we can compare the ‘nearness’ of one pair of layers with the ‘nearness’ of another pair, regardless of scale. Eg: at a national scale you might regard things as ‘close’ to each other if they are within 10km. At a city scale they might be close to each other if they are within 1km. Another way of thinking about this: if you looked at a national scale map and saw that points in A were close to points in B rather than scattered randomly, and you saw a similar pattern on a smaller scale map, you would think in both cases that places in A were close to places in B, even though the distances in the first case are much greater than the distances in the second.

Min Min Distance / Area
This provides a common measure of the Min Min Distance regardless of scale. See above for a description of why we divide by area to arrive at a proportion.

Max Min Distance / Area
This provides a common measure of the Max Min Distance regardless of scale. See above for a description of why we divide by area to arrive at a proportion.

Median Min Distance / Area
This provides a common measure of the Median Min Distance regardless of scale. See above for a description of why we divide by area to arrive at a proportion.
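The distance figures above (before dividing by area) can be sketched in Python as follows. This is an illustration only, not GHAP’s code. The function and key names are made up, and the default distance function assumes flat (x, y) coordinates in kilometres. The ‘/ Area’ versions are left out because this guide does not say how the total area is calculated.

```python
import math
import statistics

def closeness(layer_a, layer_b, dist=math.dist):
    """Distance summaries from each place in A to the places in B."""
    if not layer_a or not layer_b:
        raise ValueError("both layers need at least one place")
    # For every place in A, the distance to its nearest place in B.
    min_dists = [min(dist(a, b) for b in layer_b) for a in layer_a]
    return {
        "max_distance": max(dist(a, b) for a in layer_a for b in layer_b),
        "average_min_distance": statistics.mean(min_dists),
        "min_min_distance": min(min_dists),
        "max_min_distance": max(min_dists),
        "median_min_distance": statistics.median(min_dists),
    }

# A covers one small area; B has one nearby place and one far away.
layer_a = [(0, 0), (1, 0)]
layer_b = [(0, 1), (100, 0)]
a_to_b = closeness(layer_a, layer_b)
b_to_a = closeness(layer_b, layer_a)
```

Comparing `a_to_b` with `b_to_a` shows why the direction of the question matters: every place in A has a place in B nearby, but one place in B is a long way from anything in A, so the average from B to A is much larger.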

Web services are typically used by developers. This enables data in GHAP to be accessed by computer and used in other applications or other websites.

You can construct a query using a URL, with GET parameters, to return results in formats that computers can process so it can be used in other systems. E.g. a system that needs to search for Australian place names could call the TLCMap Gazetteer. 

Doing a search through the normal GUI produces a URL with the same GET parameters the web service would use – simple. To get this as KML (a species of XML) or JSON, just add an extra parameter to the query string. 

E.g. to fuzzy search for ‘Newcastle’ within NSW and get the results as KML just use: 

http://tlcmap.org/ghap/search?fuzzyname=newcastle&state=NSW&format=kml

Here are the full details (note that filters are treated as logical AND conditions, i.e. set intersection. There is no OR functionality.) 

Aliases that reflect Trove web services APIs have been added for ease of use.

| Parameter | Trove Alias | Description | Constraints |
| --- | --- | --- | --- |
| id | | The TLCMap unique identifier for the record. | The identifier is base 16 with any number of digits. The first letter is a namespace prefix – at time of writing there is only ‘a’ for ANPS gazetteer records and ‘t’ for TLCMap community contributed layers. The ANPS record can also be obtained with anps_id (see below). |
| name | exactq | An exact match between the input and placenames in the registry. | Searching ‘Newcastle’ will only show exact matches of ‘Newcastle’, not entries such as ‘Newcastle City’. |
| containsname | | Match a substring. | Searching ‘castle’ will match ‘Newcastle’. |
| fuzzyname | | A fuzzy search that first searches for entries where the placename CONTAINS the input (%input%), and then checks for placenames that SOUND LIKE the input (MySQL soundex). Orders by exact match, starts with input, contains input, then most sounds like input. | Can handle slight typos (eg ‘Nwecastel’), but must start with the correct letter. Needs adjusting or a better solution. |
| anps_id | | Exact match for an item in the registry with that anps id. | |
| lga | l-lga | An exact match on the LGA for registry items. | Input form contains an autocomplete feature for lga. Unsure if gazetteer contains full LGA data for all entries. |
| state | l-place | Search only for entries in this state. | |
| from | | Search for entries whose anps_id is equal to or greater than this value. | |
| to | | Search for entries whose anps_id is equal to or less than this value. | |
| format | encoding | Return the result in the given format. | Formats are: html, csv, kml, json. Selecting csv will automatically download the file instead of displaying in browser. |
| download | | Will automatically download the results to file if download=on. | Will only work if format is kml, json, or csv. Options are on or off (default). |
| paging | | Specifies how many results to display per page when viewing in browser. Choosing a lower paging will speed up loading, as it only queries x results at a time. | Do not use for non-html formats, it will simply limit the output to x results. If you want your kml/json results split use chunks instead. |
| chunks | | Split the download into x chunks for kml/json. Downloads a zip file with content listed as x children, with a parent/master file referencing them. | Requires format as kml/json and download=on. Needs further testing, not currently on the form. Unsure of how some geographical programs can handle parent/child outputs. |
| bbox | | Specifies a bounding box to search for results within, using decimal format for latitude and longitude. The order is minimum longitude, minimum latitude, maximum longitude, maximum latitude. Eg: bbox=143,-34,144,-33 shows results where latitude is between -34 and -33 AND longitude is between 143 and 144. | Requires all 4 to have an input or it will be ignored. Can use either commas or %2C to separate values. Gets a little confusing with negatives sometimes, will be simpler with a map widget. |
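For developers, here is a minimal Python sketch of building these query URLs with the standard library. The base URL and parameter names are taken from the example and table above. The function names (`build_search_url`, `bbox_param`, `fetch`) are made up for this illustration. The parameters are passed as a dictionary because `from` is a reserved word in Python. Note that `urlencode` writes commas as %2C, which the bbox parameter accepts.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://tlcmap.org/ghap/search"  # from the example URL above

def build_search_url(params):
    """Build a search URL from a dict of parameters, skipping unset ones.
    All parameters are combined as AND conditions by the service."""
    query = {k: v for k, v in params.items() if v is not None}
    return f"{BASE_URL}?{urlencode(query)}"

def bbox_param(min_lon, min_lat, max_lon, max_lat):
    """Format a bounding box in the documented order:
    min longitude, min latitude, max longitude, max latitude."""
    return f"{min_lon},{min_lat},{max_lon},{max_lat}"

def fetch(url, timeout=30):
    """Download the response body as text (makes a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")

# The fuzzy search for 'Newcastle' in NSW, returned as KML.
kml_url = build_search_url(
    {"fuzzyname": "newcastle", "state": "NSW", "format": "kml"}
)
# A 'contains' search limited to a bounding box, returned as JSON.
box_url = build_search_url(
    {"containsname": "castle", "bbox": bbox_param(143, -34, 144, -33),
     "format": "json"}
)
```

Calling `fetch(kml_url)` would then return the KML as a string. The exact structure of the KML and JSON responses is not described here, so inspect a response before writing code that depends on it.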

Occasionally you might encounter problems while using GHAP. In this section we’ll cover some common problems and how to fix them. 

The matching the TLCMap staff have done to obtain latitude and longitude is automated. This means there may be some errors. Please contact us about any errors in places.

Of 334208 places in the ANPS data, there were a few thousand that the TLCMap team could not associate with any specific point, so the LGA was used instead. There remain only 1328 places for which we could not devise an automated way to find any coordinate at all. The way the coordinates were obtained is indicated in the results in the ‘Original Data Source’ field. In some cases, ANPS also provides more detailed information on where they obtained the data. 

When you are creating a layer, GHAP may fail to upload all records for two reasons: 

  • Errors: We attempt to identify any problems with the data, such as badly formatted dates or coordinates, and report them. However, some records may be imported before the error, leaving the job half done. Check the row after the last one that was successfully added for any potential problems (e.g., badly formatted coordinates or dates, blank ‘name’ or ‘title’ etc). 
  • Very large layers (e.g., more than 5000 records): Uploading large layers can take a minute to process, so please be patient. In some cases, the layer is simply too large to handle and not all the layer is added. Also, be aware that visualisations of very large layers may take 30 seconds or so to load. This is a limitation of the web that we cannot easily resolve. 

Solutions

  • If not all records were added, you can simply upload another file containing the missing ones (corrected). Uploading another CSV simply adds the new records to the same layer. 
  • Try breaking the file into several smaller files and uploading them one by one. 
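If you are comfortable with a little scripting, splitting a large CSV can be automated. The following is a minimal Python sketch, not a GHAP tool. The function name and the default of 1000 rows per file are arbitrary choices for this example, and the file is assumed to be UTF-8 with a single header row.

```python
import csv
from pathlib import Path

def split_csv(source, rows_per_file=1000, encoding="utf-8"):
    """Write big.csv out as big_part1.csv, big_part2.csv, ... next to the
    original, and return the list of new file paths."""
    source = Path(source)
    with source.open(newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        header = next(reader)  # first row holds the column headings
        rows = list(reader)
    paths = []
    for start in range(0, len(rows), rows_per_file):
        number = start // rows_per_file + 1
        part = source.with_name(f"{source.stem}_part{number}.csv")
        with part.open("w", newline="", encoding=encoding) as out:
            writer = csv.writer(out)
            writer.writerow(header)  # every part keeps the headings
            writer.writerows(rows[start:start + rows_per_file])
        paths.append(part)
    return paths
```

Each part can then be uploaded to the same layer one by one.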

Very large amounts of information (usually around 5000 search results) may cause processing errors and timeouts. 

If you are looking at one of your own layers, it may not display if it is set to ‘private’. The 3D visualiser is built using the ArcGIS javascript API. If the data is private the ArcGIS server can’t access it. 

Solutions

  • Try breaking the file into several smaller files and uploading them one by one. 
  • Set your layer to ‘public’. 

Every now and then, all the code for a page cannot be loaded quickly enough, and GHAP stalls. 

Solution

This is usually fixed by reloading the page, or forcing a hard refresh (hold down ‘shift’ and click the page refresh button).

Sometimes text on the web displays with question marks or little squares in it, or is garbled (eg: d�j� vu). This is a common problem on the web, especially when text is cut and pasted from MS Word or Excel. This tends to happen for text that isn’t in the basic Latin alphabet, such as letters with accent marks (eg: déjà vu), or for non-Latin writing systems (eg: 中国) or special symbols (eg: © ♫). 

GHAP uses UTF-8 character encoding to ensure support for the full set of Unicode characters, which includes almost all languages and writing systems of the world, including many dead languages and many Indigenous writing systems, as well as a wide variety of symbols such as maths symbols and music notation. UTF-8 is now the de facto standard character encoding and used by most systems, especially on the web. 

If you are saving a CSV file from Excel, unfortunately, depending on the version or situation, it may save the CSV file in ‘ANSI’ encoding instead of ‘UTF-8’, which is the de facto standard character encoding.  

Solutions

  • When saving a CSV file from Excel, under ‘Save as type:’ select ‘CSV UTF-8’. 
  • If you need to get more advanced, try the free text editor Notepad++ which has tools for inspecting and changing character encodings. 
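The same conversion can be scripted. This is a minimal Python sketch under one stated assumption: that the ‘ANSI’ file was saved in Windows-1252 (`cp1252`), which is common on English-language Windows but not guaranteed. The function name is made up for this example. Keep a backup of the original file, because a wrong guess about the source encoding will produce different garbled text rather than an error.

```python
from pathlib import Path

def to_utf8(path, fallback_encoding="cp1252"):
    """Rewrite a text file as UTF-8 if it is not already valid UTF-8.
    Returns the encoding the file was read as."""
    path = Path(path)
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")
        return "utf-8"  # already valid UTF-8, leave the file untouched
    except UnicodeDecodeError:
        # Assumption: the file is in the fallback ('ANSI') encoding.
        text = raw.decode(fallback_encoding)
    path.write_bytes(text.encode("utf-8"))
    return fallback_encoding
```

After converting, open the file and check that accented characters look right before uploading it.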

A recent problem with Google Chrome is that it disabled 3D rendering, which the 3D terrain view and Temporal Earth depend on. This type of issue could happen in other browsers too. In some cases, the map simply doesn’t show up, and in others it gives an error message that might mention WebGL. 

Solutions

  • In Chrome, go to chrome://flags and set ‘WebGL Draft Extensions’ to ‘Enabled’. 
  • For other web browsers, visit http://get.webgl.org to verify that your browser supports WebGL. If it doesn’t, upgrade your browser. If it does, you may need to change your settings. Use your favourite search engine to find how to enable WebGL in your browser, e.g. by searching for the phrase ‘Enable WebGL’. 
  • 3D maps and Temporal Earth can take some time to load, so allow 20 seconds to start seeing something, especially on the first visit, or if you have a slow connection or old computer. Also, try refreshing the page.