We aim to answer all the following questions over 2019–2020. In some cases we know there are answers but haven’t yet documented them, or we need to do more research. In some cases we will simply point to easy solutions that already exist, while in others we will develop solutions. Questions we have not yet documented solutions for are greyed out.

General

There are a few places you could start digital mapping. Whatever system you use, make sure that any work you put into it can be got out again, for example by saving or downloading the data in a standard format such as CSV, KML, KMZ or GeoJSON. These standard spatiotemporal file formats are essential for moving data from one system to another, so you can access the different capabilities of each, and for depositing research data in archives.

Google My Maps

Google My Maps is a great place to start. In a matter of minutes you can get a web map going, and in so doing learn a few basics about web mapping. Do tutorials or look at the documentation, or better still, just log in and start playing. With Google My Maps you can add points, lines and shapes to a map and add information to them. You can import data, and you can export your map as a KML file. You can also embed the map in a web page.

Google Earth

Google Earth is a desktop application, so its main drawback is that you cannot share your work so easily on the web, unless you export a KML file and add that to a web mapping system. Nonetheless it is a very powerful tool for doing your map-related research. It lets you put points, lines and shapes on a map, attach information to them, and use other features such as 3D visualisations, and a time slider if your data has time associated with it. Whatever other system you use, it’s always handy to have Google Earth to open and manipulate KML files and to easily use 3D visualisations.

Quickly putting a few points on a map might be all you need, and even a simple map can have a powerful impact and make a point clear for teaching or research. If not, building your map in Google My Maps will quickly teach you its limitations and give you a clearer idea of what you want to do. Once you want to do more or different things, though, there is a bewildering array of software systems available, each for different purposes – how can you choose?

TLCMap Themes

We have found most digital mapping in the humanities focuses on one of six themes (listed on our home page). These are not mutually exclusive and many projects overlap them, but they are a convenient way to make sense of the vast amount of information about mapping software. The themes also provide access to systems we are developing that cater specifically to humanities needs – either making common tasks quicker and easier, or developing new functionality.

Tinker Geospatial Tool

Tinker provides a question-and-answer tool to help you find the right digital mapping software for the needs of your project.

Digital Mapping In Humanities Tutorial

Self paced Intro to Digital Mapping

The TLCMap projects provide examples using TLCMap systems.

Anterotesis provides a long list of geohumanities projects generally: http://anterotesis.com/wordpress/mapping-resources/dh-gis-projects/

Digital Mapping is always one of the main streams at the international Digital Humanities conference, so you can have a look through the abstracts: DH2019

To be researched and documented. This may be an area we can improve on.

To be researched and documented. This may be an area we can improve on.

To be researched and documented. This may be an area we can improve on.

The proposed ‘Time Layered Cultural Map (TLCMap)’ is intended to be infrastructure. An apt definition: “Infrastructure can be described as that which creates the conditions of possibility for certain kinds of activities.” [1] TLCMap will provide and improve the conditions of possibility for digital mapping in the humanities.

This includes basic mapping functionality with modifications for humanities researchers, and a suite of re-usable tools and features under ongoing development, driven by the needs of humanities research projects. Features relating to time will be crucial for the humanities, enabling us to study and illustrate patterns across time as well as space.

There are some fundamental institutional problems that we can’t solve with a one-year grant. We nonetheless need to acknowledge these problems and find tactical solutions to make the best use of this rare opportunity and maximise the benefit of public money.

  • Lack of institutional arrangements for the ongoing support, maintenance and archiving of research software after development projects end.
  • Casualisation and gig-based employment. IT staff can typically expect no less than a one-year contract in the commercial sector, which also pays more, making it difficult to find and retain highly skilled technical staff for small projects on short-term, part-time contracts. It also means that highly skilled staff with specialised knowledge (such as the intersection of humanities, ethics and IT, or the use of maps in VR environments) are likely to be lost to more reliable employment with fair conditions.
  • Commercial outsourcing of all university IT services, making projects infeasible due to cost, and leading to failures in service provision and an inability to adapt to changing research priorities and requirements through developer–researcher collaboration.
  • Collapse of institutional research administration and support, so that millions of dollars in funding hinge on precious time wasted on quotidian needs (eg: one year to organise EFT with the nation’s peak research funding bodies after grant approval, an 8-month turnaround on server provisioning, 3 months to get a laptop for new staff, 2 months just to get a key cut so staff can access their place of employment, etc).
  [1] Brown, Susan; Clement, Tanya; Mandell, Laura; Verhoeven, Deb; Wernimont, Jacque. ‘Creating Feminist Infrastructure in the Digital Humanities.’ DH2016, 2016. http://dh2016.adho.org/static/data-copy/531.html

In short, yes, one way or another, but as with any system, make sure you export and back up your information, and deposit research data in an official repository.

An adequate longevity plan is difficult in the absence of institutional support or commitment to post-project maintenance for eResearch generally, including digital humanities, even though this would be a small cost to protect thousands or millions of dollars of investment. This absence appears to be the case across the tertiary research sector. There are legitimate concerns that new systems won’t last long, or that the effort put in and the data gathered might be lost. We address these concerns in a variety of ways, and recommend actions you can take to be confident your research and work survives any eventuality. Ideally more funding would become available through further grants, partnerships and institutional support, but we know this might not happen, so we have the following strategies to ensure work remains whether ‘TLCMap’ continues to exist or not:

  • Independent Systems and Software ‘Ecosystem’: Wherever possible we are developing on platforms that have an existence independent of our project, including their own finances and user communities. These systems will carry on without us.
  • Data Export in Standard Formats: It is a requirement of participating systems that data be exportable in standard, interoperable formats. Individual researchers will have exported and saved their work and will be able to use it, perhaps in degraded form, in other systems, or continue developing it in different directions, perhaps with enhancements.
  • Open Source: Using open source systems means they won’t suddenly disappear because someone forgot to renew a subscription, a licence agreement changed, or the provider went out of business or discontinued the service. At the very least, in the event of complete failure an open source system can be stood up elsewhere and backed-up data (you did export and back up your data, didn’t you?) can be restored.
  • Research Data Deposit: Research data should be deposited in an official repository and registered with relevant bodies. This also helps others find it.
  • RO-Crate: RO-Crate is adopted as a standard for archiving research data. This especially addresses the need for information to remain useful in the long term when project funding and institutional interest cannot continue to pay for maintenance and upgrades. All TLCMap systems should enable export in RO-Crate format.
  • Spread Risk: TLCMap is not about developing a new system that attempts to be all things to all people, reproduces existing functionality, and so competes against established systems. Rather than a single system that risks complete failure if it is not adopted, risk is spread across a diversity of development streams and established software platforms.

Data

One of the most important things when starting many Digital Humanities projects is maintaining consistent, well structured data.

One common difference between the humanities and STEM is that the humanities are not limited to repeatable phenomena. Scientific method depends on repeated observations and repeatable experiments. While the humanities’ unlimited scope includes unique, or highly complex and historically contingent, situations, it can still be usefully informed by ‘data’. This doesn’t necessarily mean reducing the humanities, or trying to justify them by making them more scientific.

Where to Begin?

Think about what types of information about your ‘objects’ of study need to be recorded and presented, ideally before you begin. Don’t let worries about data structure stop you from starting, though. Often the structure becomes clear as soon as you start gathering the data, so make a spreadsheet, try it out on a few examples and adjust. While it’s best to avoid late changes to structure, so you don’t have to go back to the library or the field, you can always add a column if you missed something important. If something is important, or if on the other hand you are trying to gather so much data it’s not practical, you’ll probably realise early on – so just get started.

How do you make information well structured? Often it’s not as complicated as it seems. The simple answer is, “Just put it in a table under column headings.”

This is not well structured data:

The Mona Lisa by Leonardo Da Vinci, between 1503 and 1506, maybe 1517. Most famous painting.
Last Supper, 1495 – unknown, Da Vinci. Often referenced in popular culture, this work was…
Michelangelo, c. 1511–1512, Sistine Chapel. Commissioned by…

The artists, painting titles and dates are in different orders, the dates are stored in different ways, and sometimes the name of a single individual is stored differently. The descriptions are just notes and you’ll want to edit them later (that’s ok, but save yourself some trouble by making it as finished as possible).

This is well structured data:

Painting | Artist | Start Date | End Date | Date Exactness | Description
The Mona Lisa | Leonardo Da Vinci | 1503 | 1506 | c. | The most famous painting in the world, etc.
The Last Supper | Leonardo Da Vinci | 1495 | | c. | Often referenced in popular culture, this work was…
Sistine Chapel | Michelangelo | 1511 | 1512 | c. | This ceiling decoration was commissioned by, etc.

That’s not hard to understand. That’s the main point, but there are a few more things worth bearing in mind:

Numbers, Dates and Text… and Notes

Software usually handles different kinds of data differently. The main distinctions are numbers, text and dates.

Store numbers as numbers, without adding any text to them. Eg: if there is a column for ‘Quantity Of Grindstones’, don’t put ‘About 32’. Put ‘32’. This means the numbers can be used to arrive at (estimated) totals and averages. In the humanities we are often dealing with ‘data’ that isn’t measured strictly or consistently as in science. Text can’t be added and subtracted, so leaving a value as a number allows calculations to be made, to which you can add caveats and explanations later (eg: a column called ‘notes’ that says, ‘Values are conservative estimates only, based on Emerson’s diaries…’).
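For instance, once quantities are stored as plain numbers, a few lines of code can total and average them. A minimal sketch in Python, assuming a hypothetical spreadsheet saved as ‘finds.csv’ with a ‘Quantity Of Grindstones’ column:

  import csv

  # Total and average a numeric column. int() would fail on a value like
  # 'About 32' -- which is exactly why the column should hold plain numbers,
  # with caveats kept in a separate 'notes' column.
  with open("finds.csv", newline="", encoding="utf-8") as f:
      quantities = [int(row["Quantity Of Grindstones"]) for row in csv.DictReader(f)]

  print("Total:", sum(quantities))
  print("Average:", sum(quantities) / len(quantities))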

Text allows for anything at all, but sometimes you will want to use it for consistent named categories.

Dates and times are tricky to handle, so keep to a consistent format and don’t add extra text to them. Eg: stick to dd/mm/yyyy HH:MM:SS or some other common format.

Be consistent

Always write the same thing, or the same sort of thing, in the same way. Eg: decide whether you want to write ‘da Vinci’ or ‘Leonardo da Vinci’ and always write it that way. If you record a date in the format 29/04/2020, don’t change to 29 April 2020.

What To Gather?

You may want to break this up differently, specifying whether the first or second date is uncertain, or using only the finishing date if that is all that is relevant, and adding whatever other columns are pertinent. What information you put in depends on:

  • the needs of your project
  • the time, money and effort you can put into collecting it
  • the input requirements of the system you want to add it to

More Is Better

If you can gather more details, do. It’s easier to take out subsets of information than to revisit every data item.

Don’t use MS Word

Avoid MS Word for recording data; use it for writing letters and essays. Although you can make tables in MS Word, and they are better than loose notes, they will ultimately need to be copied into some other format that a computer program can handle more easily. The most commonly used tool, and one much easier for a computer to handle, is Excel. If you make columns in Excel you are off to a good start and will save everyone, including yourself, a lot of time and headaches later, because Excel files can be saved as .CSV files, which are easy for computers and programmers to work with. (Note you can still make a mess of an Excel or .CSV file – just keep all the data broken up into columns, with only one type of information in each column.)

Structure As You Go

It’s easiest to gather your information in the right structure as you go, rather than transcribe it later.

Just Ask

If possible, ask someone what fields (or column headings) are required, or whether your data structure is good. If you intend your data to go into a particular system, check what requirements it has. Eg: if you want to put your data into Google Maps, even if you’re not sure about the technical standards of KML and other acronyms, you can see that you should at least have a ‘longitude’, ‘latitude’, ‘name’ and ‘description’ for every point you want to plot. If you have at least that in a spreadsheet, it can be converted to the right format.

One Type Of Information, One Column

If types of information can be distinguished, split them into more columns. Eg:

Artist | Painting
Grace Cossington Smith | The Bridge in Curve (1930)
Katsushika Hokusai | The Great Wave off Kanagawa (1833)

becomes

Artist | Painting Title | Painting Date
Grace Cossington Smith | The Bridge in Curve | 1930
Katsushika Hokusai | The Great Wave off Kanagawa | 1833

This structure or that?

There can be a bit of an art to designing structure. Depending on the nature of your research and the data you can find, you might organise it one way or another. Eg: let’s say it’s about artists and the places they are associated with. You could do it like this:

Artist | Birth Place | Place of Death
Sidney Nolan | Carlton, Melbourne | London

or like this:

Artist | Place | Place Relation
Sidney Nolan | Carlton, Melbourne | birthplace
Sidney Nolan | London | place of death
Sidney Nolan | Heide | lived
Sidney Nolan | Birdsville | photographed

The first is more suitable if you are specifically interested only in places of birth and death, but it would result in too many columns, many of them empty, if you wanted a column for every possible type of place. The second allows for any kind of place associated with the artist, but the ‘Place Relation’ column should still use consistent categories where possible.

Complex Structures

Information structure can sometimes get a bit complex. Let’s say you want some extra information about the artists, such as when each was born and died, whether they were sculptors and/or painters, what cities they worked in, who their patrons were, etc. You don’t want to add all that information to every row in your table of paintings. You need a separate table that stores the artist information once per artist. You can then relate this back to each painting by the artist’s name. This is the structure of a ‘relational database’. You can still gather this data in Excel for convenience, but make sure you are consistent in writing the artists’ names, so they match up across tables. Keeping these tables makes it possible to convert the information into a proper database, which can then be used to mix and match, filter and display the data in all manner of ways, including on the web.
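To see why consistent names matter, here is a minimal sketch in Python joining two such tables. The file names and the ‘Born’ and ‘Died’ columns are hypothetical; the join key is the consistently written artist name:

  import csv

  # Load the artists table once, keyed by the artist's name.
  with open("artists.csv", newline="", encoding="utf-8") as f:
      artists = {row["Artist"]: row for row in csv.DictReader(f)}

  # Relate each painting back to its artist's details via the shared name --
  # which is why the name must be written consistently in both tables.
  with open("paintings.csv", newline="", encoding="utf-8") as f:
      for painting in csv.DictReader(f):
          artist = artists[painting["Artist"]]
          print(painting["Painting Title"], "--", artist["Born"], "to", artist["Died"])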

A Paradox of Structure and Flexibility

Why structure information this way? It seems rigid and inflexible, but well structured data is what enables computers to be flexible. A computer doesn’t care if there are a few or a million records; if they are structured in the same way it will process them quickly. It can filter and mix and match the information, change formats, run calculations, and pass the data to visualisations. Without a consistent format the computer can only display the data the way you put it in – it can’t do anything with it. You lose the ability to manipulate it, so badly structured data, while flexible in your terms, is inflexible for a computer.

In the badly formatted art information above, the computer has no way of knowing which text it should treat as an artist, which as a painting name, and so on. If the information is in columns, the computer can treat everything in the first column as an artwork, everything in the second as an artist, and so on. ‘Structured data’ is part of working with computers as a medium – you don’t normally work clay with a paintbrush, and you don’t normally spin paint on a potter’s wheel. To work with computers, use structured data.

So while lots of different systems require different formats, the most important thing is to be consistent and structured. Even if you don’t know what specialised formats it might have to be in later, if it is well structured it’s much easier to write a small program to convert it all into the right format for any system.

If it’s too late and you only have badly structured information, even if it takes hours or days to convert your notes into well structured data, it’s a small effort for the benefits of being able to query it to identify relationships, extract subsets for other purposes, generate lists for publications, run it through statistics applications to generate graphs, plot it on a map, make an online gallery, turn it into social network diagrams, display it on the web and whatever other relevant thing a computer can do.

Link a single place to an external site with a customised web page about that place.

We can consider two ways of structuring information that is to be mapped. Each relates to how the information appears and is interacted with on the map. These two approaches touch on some conceptual issues in information technology, but can be explained in plain language, and those who enjoy critical theory in the humanities will note how they relate to themes such as decentering, non-linear mixed-media reader-constructed narrative, and ‘rhizomes’. The following includes both a simple explanation and example, and a slightly more detailed account for those interested.

Single Point

One way attempts to make all the information (text, images, links, videos, etc) accessible through a single point that is clicked, which then shows this information set out like a prepared document or curated collection. This is often how people first imagine it working – you click the point, and the pop-up window might contain a description, some related images, perhaps a video and a list of relevant links, etc. For example, let’s say you have a series of places relevant to a person’s life: you want to map all those places and attach all the photos, letters and other materials to them.

You would want to set this out in a coherent narrative related to that place. This makes sense in some cases, but quickly runs into problems and has some drawbacks. Perhaps there are a hundred or more images, or one site requires a 3,000-word account rather than a paragraph? These are too large to fit into the small pop-up or side bar that appears when you click the dot. It would not be user friendly.

At present TLCMap, focused on mapping, doesn’t allow uploading of images or other media into a collection, which would be required for embedding them in a little pop-up window. Collections such as these belong in collection systems. Nonetheless, such a collection system, if it allowed geocoding of images, could provide a map which, when clicked, links to any of those geocoded images – which is precisely how we recommend TLCMap be used: to provide the map, linking to a page (as in this approach) or to a specific item (as in the approach below).

There are many scenarios where you do wish to set things out in a neat narrative, perhaps with embedded images, and other cases where you wish to provide access to hundreds of photos of a place through a single dot on the map. There are some ways to achieve a curated, presentable multimedia story with maps, such as Story Maps, that you may use instead of TLCMap. The way to achieve this with TLCMap, though, is to set up your project website separately and integrate with TLCMap for the mapping component. Eg: if you have a collection of images, set up a web collection management system. If you want a nicely formatted web page with text and images about a place, set that up on your website. Then create a map in TLCMap that simply links the dot for that place to the collection or to the web page. The map can also be embedded in your website, so it is seamless within your site.

Eg: You have a website on goldfields. You have web pages for major sites such as Bendigo and Kalgoorlie, each giving a history of the town with pictures and a brief documentary video. This would be too much to cram into a pop-up window on the map. Simply create a map with a point for each town and use ‘linkback’ to link each point to the relevant web page. For example, an Excel file, to be saved as a CSV for import, might look like this:

Date | Latitude | Longitude | Title | Placename | Description | Linkback
1851 | -37.5601 | 143.8549 | Gold Found in Ballarat | Ballarat | Ballarat wasn’t the first place in Australia where gold was found, but it began the first major gold rush that created a population boom and changed Australian society. | https://en.wikipedia.org/wiki/Ballarat
1851 | -36.7568 | 144.278 | Gold Found in Bendigo | Bendigo | Four people claimed to have first found gold in Bendigo. Bendigo became one of the major centres for early gold mining. | https://en.wikipedia.org/wiki/Bendigo
1857 | -23.0533 | 150.1853 | Gold Found in Canoona | Canoona | The first payable gold found in Queensland was at Canoona; however, after the gold rush little gold was found, leaving many destitute. | https://en.wikipedia.org/wiki/Canoona
1893 | -30.7487 | 121.4654 | Gold Found in Kalgoorlie | Kalgoorlie | The gold find at Kalgoorlie led to another major gold rush. | https://en.wikipedia.org/wiki/Kalgoorlie

This method works best if you want to carefully control the presentation and order of information related to a place – simply link to a web page on your site that does that.

Directory or ‘Inverted Hierarchy’

Associate each and every resource (photos, text, events, etc) with a place.

If we have a large collection, such as hundreds of photographs, that we want to geolocate and then map, or a collection of information of many different media or types – scanned letters, photographs, transcripts, videos, etc – then rather than putting a dot on the map for each place and providing a single page that pulls all that information together, we can turn this upside down and attach a place to every one of these resources. You would list every photo, every letter and every video, along with its coordinates.

This treats the information more like ‘data’, and although some control is lost over the order of presentation, it is much better for making information discoverable, queryable and presentable in different ways.

In information design this might be thought of as an ‘inverted hierarchy’: instead of starting at the top and thinking about all the things that come under that heading and its subheadings, we start at the bottom, with all the things, and attach information attributes to them. This can also be thought of as a ‘directory’, as opposed to a taxonomy.

A good way to explain how this structure can solve problems is the platypus. People in Europe thought the platypus must be a hoax because it didn’t fit neatly into any biological category – it has fur and suckles its young like a mammal, it has a bill like a bird, it lays eggs like a bird or reptile, it has a pouch like a marsupial, it’s venomous like a reptile or insect, and it is amphibious. So what phylum or class should it be put in? All of the above? A special category had to be created for it, along with the echidna – ‘monotreme’. Platypuses and echidnas are an exceptional case in biology, but in the world there are often scenarios with many things like platypuses, and if a category needs to be created for every exception, the whole point of categories, where things must be in mutually exclusive branches, breaks down. To deal with this we look at each thing and ask what ‘attributes’ we use to describe it. These are analogous to the columns in a spreadsheet. Eg: for the platypus the attribute ‘lays eggs’ is ‘yes’; the attribute ‘habitat’, with a choice of land/water/amphibious, will be ‘amphibious’, like the frog; the attribute ‘outer layer’ might be ‘fur’, etc. In this way, rather than looking in the category ‘mammal’, containing all things that suckle their young, and failing to find a platypus because it’s in another box called ‘monotreme’, we can ask queries like ‘Show me all things that suckle their young’ and get dogs, cows, whales, people and platypuses, while at another time we can ask ‘Show me all the things that lay eggs’ and get a list including parrots, albatrosses, emus, platypuses, etc.
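In code, this attribute approach is just a filter over records rather than a lookup in a category box. A toy sketch in Python (the animals and attributes are illustrative only):

  # Toy records with attributes instead of mutually exclusive categories.
  animals = [
      {"name": "platypus", "suckles_young": True, "lays_eggs": True, "habitat": "amphibious"},
      {"name": "dog", "suckles_young": True, "lays_eggs": False, "habitat": "land"},
      {"name": "emu", "suckles_young": False, "lays_eggs": True, "habitat": "land"},
  ]

  # 'Show me all things that suckle their young' -- the platypus appears.
  print([a["name"] for a in animals if a["suckles_young"]])

  # 'Show me all the things that lay eggs' -- the platypus appears again.
  print([a["name"] for a in animals if a["lays_eggs"]])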

So what does that have to do with mapping in the humanities? When designing TLCMap we had to find a way to handle a common requirement for mapping across many disciplines and diverse, idiosyncratic projects. One thing these all have in common, if they are to be digitally mapped, is coordinates (and ideally dates, if they are to be mapped in time). The things to be mapped are of different kinds, with some similarities and some differences – the attributes are inconsistent – but the attributes of having coordinates, and possibly dates, are something in common that we can design a system around (though yes, we are also working on systems for ‘maps’ that don’t use geographic coordinates). TLCMap has been designed to focus on this ‘directory’ or ‘inverted hierarchy’ way of doing things.

What it means for your project is that you can look at all the things you want mapped and attach the attributes latitude and longitude, and optionally dates and placenames, to each of them. This also means you can define subsets of things to be mapped much more flexibly. Instead of having one map of the places, with everything associated with each place trying to fit under it, you can take subsets from the data to, for example, show a map of all the goldfields which failed, all those in a certain state, a map of just the photographs, a map of just the goldminers’ letters, or a map of all the materials related to goldfields where there were organised uprisings and activism (assuming you have recorded all that information in your data). It also enables much more nuance in viewing a location over time. For example, when using the TLCMap timeline, you could see the different paintings, photos and letters in a single place over time, rather than a single place, with a single start and end date, containing all that information.

As a practical example, your data might look like this instead of the above:

StartDate | EndDate | Latitude | Longitude | Title | Placename | Description | Type | Linkback
1851 | 1851 | -37.5601 | 143.8549 | Gold Found in Ballarat | Ballarat | Ballarat wasn’t the first place in Australia where gold was found, but it began the first major gold rush that created a population boom and changed Australian society. | Event | https://en.wikipedia.org/wiki/Ballarat
1853 | 1854 | -37.5601 | 143.8549 | Painting of the diggings | Ballarat | Painting by Eugene von Guerard of Ballarat’s tent city in the summer of 1853–54. | Media | https://en.wikipedia.org/wiki/Ballarat#/media/File:Ballarat_1853-54_von_guerard.jpg
1854 | 1854 | -37.5601 | 143.8549 | Eureka Stockade Painting | Ballarat | Painting of the Eureka Stockade by John Black Henderson (1827–1918). | Media | https://en.wikipedia.org/wiki/Ballarat#/media/File:Eureka_stockade_battle.jpg
1851 | 1851 | -36.7568 | 144.278 | Gold Found in Bendigo | Bendigo | A brief historical account of the town of Bendigo – four people claimed to have first found gold in Bendigo. Bendigo became one of the major centres for early gold mining. | Text | https://en.wikipedia.org/wiki/Bendigo
1858 | 1858 | -36.7568 | 144.278 | Gold Found in Bendigo | Bendigo | Unknown artist – McPherson’s Store, Bendigo (c.1858). Watercolour. | Media | https://en.wikipedia.org/wiki/Bendigo#/media/File:Charing_Cross_Bendigo_1853.jpg
1853-06-06 | 1853-06-06 | -36.7568 | 144.278 | Bendigo Petition | Bendigo | Signed by over 23,000 miners, the Bendigo petition was an attempt to get representation and reasonable taxation from the British Government. | Event | https://ballaratheritage.com.au/article/the-1853-bendigo-goldfields-petition/

This is suitable for mapping many things of a single kind, or a variety of different types of resources, possibly in different ways. If you wanted, you could break the spreadsheet down in different ways, for example to create a layer of just paintings, a layer of just ‘events’ and a layer of ‘letters’, and then combine them into a multilayer, all of which would enable these to be viewed over time, presuming they have dates. This can create a lot of dots all in the same place – such as a great many in Bendigo – but this can be handled with the ‘cluster’ style viewing option, which shows a large dot indicating that there are, say, 43 photos located in ‘Bendigo’, and which can then be expanded to show each individual thing located there.
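As an illustration of breaking the spreadsheet down, here is a minimal Python sketch (assuming the table above is saved as a hypothetical ‘goldfields.csv’) that writes one CSV per ‘Type’, each of which could then be imported as its own layer:

  import csv
  from collections import defaultdict

  # Group the rows of the combined spreadsheet by their 'Type' column.
  layers = defaultdict(list)
  with open("goldfields.csv", newline="", encoding="utf-8") as f:
      reader = csv.DictReader(f)
      fieldnames = reader.fieldnames
      for row in reader:
          layers[row["Type"]].append(row)

  # Write one CSV per type (Event, Media, Text, ...), each of which can be
  # imported as its own layer and combined into a multilayer.
  for kind, rows in layers.items():
      with open(f"goldfields_{kind.lower()}.csv", "w", newline="", encoding="utf-8") as f:
          writer = csv.DictWriter(f, fieldnames=fieldnames)
          writer.writeheader()
          writer.writerows(rows)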

Which of these approaches you choose is up to you and the needs of the project. The former gives you more control over the narrative and the order in which things are presented on your site, while the latter gives more data flexibility and more discoverability by others doing a TLCMap search, and each thing can still be linked to your site. Depending on the context it could be an advantage or a disadvantage to dismantle the hierarchy and let people explore the items in their own non-linear way, making their own connections. TLCMap is designed more for the ‘directory’ structure, though, and that tends to be the most useful and recommended approach.

CSV files aren’t specifically for spatiotemporal data, but because they are a widespread format for spreadsheets that is easy to read across many different systems, they are often also used for spatiotemporal data. A CSV file may be created by saving an Excel file as filetype ‘csv’. ‘CSV’ stands for ‘comma-separated values’.

KML is a standard geocoding XML format. This means it can be processed by a computer easily, and can also, to some extent, be read and modified by a human. Because KML is a standard format for geodata, it can usually be imported into other systems. One of our main aims is not to try to build one system that does all things, but to allow for and further the parallel development of different systems independently. Interoperability, then, is key. How this works in practice is often by making data produced in one system available to another in a standard format. Sometimes this is as simple as exporting a KML file from one system and importing it into another. Another common format for geodata is GeoJSON. A good tip is to make sure you can get the effort you put into your system out of it again in some standard format.
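To give a flavour of the format, a minimal KML file with a single placemark might look like this (note that KML writes coordinates longitude first, with an optional altitude):

  <?xml version="1.0" encoding="UTF-8"?>
  <kml xmlns="http://www.opengis.net/kml/2.2">
    <Document>
      <Placemark>
        <name>Gold Found in Ballarat</name>
        <description>Site of the first major Australian gold rush.</description>
        <Point>
          <coordinates>143.8549,-37.5601,0</coordinates>
        </Point>
      </Placemark>
    </Document>
  </kml>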

GeoJSON is another standard for spatial data, but written in JSON, which is a popular way of structuring data for the web.
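The same placemark as minimal GeoJSON – again with longitude first, a common cause of back-to-front coordinates:

  {
    "type": "FeatureCollection",
    "features": [
      {
        "type": "Feature",
        "geometry": {
          "type": "Point",
          "coordinates": [143.8549, -37.5601]
        },
        "properties": {
          "name": "Gold Found in Ballarat",
          "description": "Site of the first major Australian gold rush."
        }
      }
    ]
  }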

All of these data formats are stored as plain text, so they can be read and edited by a computer or human.

How to create KML or GeoJSON files:

  • Use a tool like geojson.io
  • Many map systems allow export of data in KML or GeoJSON format
  • Find data in repositories

Convert a CSV file to KML or GeoJSON

Often we have data in a spreadsheet, or exported from a database in tabular form as a CSV file, with columns for latitude and longitude.

An Excel spreadsheet can be ‘saved as’ in .csv format.

You can convert CSV to KML by importing it into TLCMap Quick Coordinates, Google MyMaps or Google Earth.

Alternatively, you can find converters on the web by Googling ‘convert CSV to KML’ or similar, such as CSV to KML or MyGeoData Converter.
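If you are comfortable with a little code, a conversion script can be very short. A minimal sketch in Python (the file name and column headings follow the goldfields example above; adjust them to match your own spreadsheet):

  import csv
  import json

  features = []
  with open("goldfields.csv", newline="", encoding="utf-8") as f:
      for row in csv.DictReader(f):
          features.append({
              "type": "Feature",
              # GeoJSON coordinate order is [longitude, latitude].
              "geometry": {
                  "type": "Point",
                  "coordinates": [float(row["Longitude"]), float(row["Latitude"])],
              },
              "properties": {
                  "name": row["Title"],
                  "description": row["Description"],
              },
          })

  with open("goldfields.geojson", "w", encoding="utf-8") as f:
      json.dump({"type": "FeatureCollection", "features": features}, f, indent=2)

Writing KML is similar, building the XML elements instead, or you can import the resulting GeoJSON into one of the tools above and export KML from there.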

Although, for ethical reasons, not all data should be open, TLCMap advocates Open Data and Creative Commons licensing.

Depositing data in a repository can help researchers and others find it. Some repositories are indexed by major search engines.

Based on advice from Alexis Tindall (ARDC):

  • If your University/research institution requires you to deposit research data with them, do so. Ensure you obtain a DOI so the dataset and its version can be referenced and cited.
  • Datasets deposited with your institution should feed through to ANDS (Australian National Data Service) for improved discoverability. If they don’t, the feed may be broken, so ask your institution to fix it.
  • You can also deposit humanities and social sciences datasets with the ADA (Australian Data Archive). ADA provides some functionality for nuanced access to sensitive datasets, and integrates with AURIN.
  • Look for a data repository specialising in a relevant subject area, such as Pelagios Commons
  • If your dataset includes Indigenous information, especially if there are access restrictions, consider depositing with AIATSIS.
  • If you have a list of placenames we welcome contributions to GHAP (Gazetteer of Historical Australian Placenames).
  • To search for Humanities research datasets others have deposited, go to Research Data Australia.
  • You may also try National Map which sources data from these locations.

Depositing Data In More Than One Location

Based on advice from Tom Honeyman (ARDC):

Having materials deposited in more than one archive/repository can be good for ensuring the preservation of materials. The best way to resolve this is to talk to both repositories/archives about their requirements, and to let them know the datasets will be mirrored. They should be able to advise what they prefer in this scenario. The institutional repository is quite likely to suggest going with the other one.

If duplicating datasets, the data and the metadata captured against them (including identifiers) should be identical. In practice, though, requirements for metadata and the form of the datasets may differ between the two.

  • If there are differences between the datasets when they are received (or if they are transformed during ingestion), this should be noted in the metadata of both copies, especially if one is a later or revised version of the other.
  • Different or otherwise, the metadata should note which of the two is the “primary” copy, and hence the preferred way to cite the data.
  • Ideally, you should have only one identifier used in both places, but if you do get two (which is likely), see if each record can resolve to the other’s DOI, and ensure that both (or all) identifiers are listed in both metadata records.
  • The situation to avoid is confusion about where people should go to get the data, and which of the two sources to cite.

Humanities researchers often need to deal with information that is in some way vague or uncertain. For example, we may want to map a diary entry which says ‘3 days north of the bend in the river’ or ‘late Spring’, but these need to be translated into specific coordinates and times in order to be placed on a map. Simply placing information on the map may give viewers the false impression that it is accurate or certain. This can have major implications if people use the information in an appeal to authority (eg: the University says it was here at this time) to prove some case, potentially with legal implications. In other situations, users may misinterpret mapped information as complete, such that gaps on the map seem to indicate there was nothing there, rather than no research done or no data gathered there yet. In any case, a common requirement requested by humanities researchers is the ability to represent vagueness in some way.

This involves many questions that could be handled with different data structures:

  • What kind of vagueness is it: margin of error in measurement, informed estimate based on multiple sources, infilled data, something that occurred within a range of time and place, etc?
  • Do we want to represent that it is certain/uncertain or indicate a degree of uncertainty?
  • Can the system represent this with icons, fading, colour, shape, dotted lines, blurring?
  • Will this degree of accuracy cause users not to use it? Do we need data entry to include a figure for how accurate a value is, or the range of time and space within which it might fall? Will this additional data be so onerous that people won’t bother, or take so much time that the project won’t finish?

All of these need to be considered and balanced against each other, and the needs depend on the circumstances. Often simple answers are best: practicality dictates that we don’t want to overcomplicate data entry, we don’t have time and money for extra detail, we need to work within, around and adapt established formats rather than create new systems, and we want users to be able to interpret visualisations without reading manuals. At a base level, we could:

  • add a question mark to the end of a place title or specific attribute value to indicate at least that there is some uncertainty about a place eg: Brisvegas(?).
  • use a ring instead of a pin or a circle to indicate that the exact location is vague rather than having pin-point accuracy.
  • use the datestart and dateend attributes common in geodata standards to indicate a range of time within which an event occurred.
  • ensure the surrounding and contextualising information highlights and explains the issues around uncertainty.
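For instance, the first and third of these conventions might look like this in a spreadsheet (an illustrative row only – the title, place, coordinates and dates are invented, with ‘late Spring’ recorded as a date range):

Title | Placename | Latitude | Longitude | DateStart | DateEnd
Camp three days north of the river bend(?) | Brisvegas(?) | -27.47 | 153.02 | 1845-10-01 | 1845-11-30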

Nonetheless there is some research investigating nuances of representing vagueness, eg:

There are several metadata standards for spatial information, sometimes overlapping, and sometimes with more or less detail than seems needed. We will aim to ensure that any metadata standard used can at least be transformed into another common standard.

AURIN has already done work to establish this guide: https://aurin.org.au/legal/metadata-record-guide/, including a metadata tool based on an extended version of ISO 19115 (which was used in the creation of the AS/NZS version). AURIN’s original metadata standard work was funded under demonstrator projects with ANDS: https://projects.ands.org.au/id/AP31

Dublin Core is also a good, well established standard to follow for a set of basic metadata: https://dublincore.org/

There may be any number of reasons. Here are a few common problems:

Coordinates are back to front. Coordinates often appear as a pair, like this: -32.914154, 151.800702. Some systems assume latitude first and longitude second, while others expect them the other way around. Even among Google’s own mapping tools, some expect one order and some the other.

Coordinates are in an unexpected format. Coordinates can be expressed in different ways: as decimal numbers, as degrees, minutes and seconds, and so on. Check your data is in the correct format; if not, convert it using a conversion tool.
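If your coordinates are in degrees, minutes and seconds, the conversion to decimal degrees is simple arithmetic: degrees + minutes/60 + seconds/3600, negated for the southern and western hemispheres. A minimal sketch in Python:

  def dms_to_decimal(degrees, minutes, seconds, hemisphere):
      """Convert degrees/minutes/seconds to decimal degrees.

      Southern (S) and western (W) hemisphere values are negative.
      """
      decimal = degrees + minutes / 60 + seconds / 3600
      return -decimal if hemisphere in ("S", "W") else decimal

  # 32 degrees 54' 50.95" S is approximately -32.914154
  print(dms_to_decimal(32, 54, 50.95, "S"))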

An invalid character or other glitch may be the problem. Computers are temperamental and very literal. Sometimes a whole system might not work because of a full stop in the wrong place, and a single letter in a coordinate field that is assumed to be a number can make some systems fail. The only way to deal with this is to hunt down the problem and correct it.

To find and fix problems, try working with just a very small sample of your data. If it doesn’t work, it will be easy to find issues and try different approaches. If the problem doesn’t occur, you can keep adding chunks of your data until you narrow down where the problem is occurring.
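A small script can also do the hunting for you. A minimal sketch in Python (the file name ‘mydata.csv’ and its ‘Latitude’ and ‘Longitude’ headings are assumptions; adjust to suit) that reports rows whose coordinates won’t parse or fall out of range:

  import csv

  # Report rows whose coordinates won't parse as numbers or fall outside
  # the valid range -- a quick way to hunt down a stray letter or full stop.
  with open("mydata.csv", newline="", encoding="utf-8") as f:
      for n, row in enumerate(csv.DictReader(f), start=2):  # row 1 is the header
          try:
              lat = float(row["Latitude"])
              lng = float(row["Longitude"])
          except ValueError:
              print(f"Row {n}: not a number: {row['Latitude']!r}, {row['Longitude']!r}")
              continue
          if not (-90 <= lat <= 90 and -180 <= lng <= 180):
              print(f"Row {n}: out of range (latitude and longitude swapped?): {lat}, {lng}")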

To be researched and documented. This may be an area we can improve on.

To be researched and documented. This may be an area we can improve on.

To be researched and documented. This may be an area we can improve on.

Time

There are many ways we think and talk about time. We aim to make available ways to structure temporal information and visualise it for different circumstances such as:

  • journey (eg: a ship, or ships, on a journey is at certain points at certain times. The data structure is a series of points with times or durations; see the example after this list.)
  • serial (eg: a sequence of events which happen in a certain order, but with no specific dates. We just want to see that one happens after another. Eg: “first go to Dudley, then to Merewether, Dixon and Bar, and then on to Newcastle and finally Nobby’s”)
  • migration (quantities of movement between places. Eg: not just this ship or those ships going from here to there, but 200 people going from London to Sydney and 150 to Melbourne in 1960; 300 to Sydney and 350 to Melbourne in 1961; 256 to Sydney and 132 to Melbourne in 1962, etc)
  • frontier (movement of lines and polygons, rather than movement of a point along them. Eg: the Western Front in WWI.)
  • stationary change (places which stay in one location but have multiple properties changing over time. Eg: cinemas don’t move, but they change between being cinemas and not being cinemas, change managers, change from showing Hollywood films to Greek and Italian films, etc)
  • cyclical (things which repeat in a pattern, with an amount of time between each repetition. These may be concentric cycles at different scales. Eg: Indigenous seasons.)
  • calendar (things which happen repeatedly at certain times, eg: train timetables)
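For example, the ‘journey’ structure mentioned first above might be a spreadsheet with one row per point along the route (the ship name and dates are invented for illustration):

Ship | Latitude | Longitude | DateStart | DateEnd
Example Ship | -33.8568 | 151.2153 | 1835-01-10 | 1835-01-14
Example Ship | -32.9142 | 151.8007 | 1835-01-16 | 1835-01-18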

To be researched and documented. This may be an area we can improve on.

To be researched and documented. This may be an area we can improve on.

To be researched and documented. This may be an area we can improve on

To be researched and documented. This may be an area we can improve on.

To be researched and documented. This may be an area we can improve on. (canoe time)

To be researched and documented. This may be an area we can improve on.

Processing and Metrics

We are improving on features in Recogito, which uses Named Entity Recognition (NER) to automatically identify places and people in texts and produce maps of the places.

See ‘How can I get statistics and metrics on spatiotemporal data?’

To be researched and documented. This may be an area we can improve on.

‘Close’ compared to what? Handling statistics on ellipsoid surfaces, and with time too.

To be researched and documented. This may be an area we can improve on. (least cost techniques etc)

Images, Virtuality and Visualisation

The following provide georeferencing tools that are free to some extent:

To be researched and documented. This may be an area we can improve on.

Google Earth can be used to draw polygons, set elevation and extrude them to create simple 3D shapes. For presentation you could simply use screen capture to make a video.

More information is required for doing this on the web in an interactive 3D environment, and for doing more detailed architectural reconstructions. An account is also needed of how to handle not just coordinates but which floor things are on in city environments, etc.

To be researched and documented. This may be an area we can improve on. HuNI, M2M.

To be researched and documented. Omeka, WordPress, ArcGIS Storymaps, Google MyMaps, etc etc.

To be researched and documented. This may be an area we can improve on.

To be researched and documented. This may be an area we can improve on.

To be researched and documented. This may be an area we can improve on.