Discussion:
[talk-au] Victorian Vicmap Address Import Proposal
Graeme Fitzpatrick
2021-05-18 23:36:10 UTC
Permalink
Great idea, thanks Andrew!

Vic only, or are other states to follow at a later stage?

Thanks

Graeme


On Tue, 18 May 2021 at 22:30, Andrew Harvey via Talk-au wrote:
I'm working with the National Heavy Vehicle Regulator (NHVR) on a proposal
to import Vicmap addresses for Victoria into OSM. For context, here is the
statement from the NHVR.
The NHVR is making a substantial investment in the creation of a
spatially based mapping tool to improve the efficiency and safety standards
of the Heavy vehicle industry within Australia. As the regulator, we are
particularly aware that not all heavy vehicles are suitable for all roads,
and wish to ensure that heavy vehicles are aligned to the most appropriate
roads wherever possible. To support the principles of open data the NHVR
has made a decision to utilise OpenStreetMap as its base layer map. As
such, we want to ensure that OSM is populated with the most accurate and
rich data possible. Whilst NHVR’s needs are primarily to utilise pickup and
delivery, depots and farms addresses used by the heavy vehicle industry, we
are proposing a mass import that looks to populate both commercial and
residential addresses to support the much more diverse use cases of the
wider community, as well as our own.
We are excited to work with the OpenStreetMap community and we look
forward to any and all feedback sourced from this great community.
While my work on this is supported by the NHVR, the community ultimately
has the say on if this import happens, and how.
I've been preparing code and documentation on what I'm proposing for a
potential import at https://gitlab.com/alantgeo/vicmap2osm/. It's still a
work in progress at this stage and will continue to evolve based on
feedback here and as I continue to work on it. I'm yet to put together a
formal import plan on the OSM wiki, as I wanted to get community feedback
first.
I'm keen to hear any feedback for this potential import, or answer any
questions.
--
Andrew Harvey
_______________________________________________
Talk-au mailing list
https://lists.openstreetmap.org/listinfo/talk-au
Little Maps
2021-05-19 00:51:14 UTC
Permalink
Great initiative Andrew. Re the question of where address tags will be placed:
is it possible to circulate a short list of options to facilitate discussion on this question? Thanks, Ian
My main concern would be to decide where addresses will be added.
I would say this is a hot topic of discussion in OSM, and there are a few different opinions held by contributors.
I agree that we would need to discuss and have a consensus on where the address tags should go, for a potential import.
Sebastian Spiess
2021-05-19 06:47:12 UTC
Permalink
Hi Andrew,
indeed a great initiative, and yes, the NSW import has been stalled for way too
long.

You will also need to detail how to deal with unit numbers. For the NSW
import there were many single houses that had several entries like 12A,
12B and 12-2 Lakewook Road. Do you import them as individual nodes, or
just one, omitting A/B/2?

My comments, also based on some of my NSW import experience, are inline
below.

Cheers, Seb
1. How should we handle existing address interpolation ways? Should
these be left as they are or replaced with individually mapped address
points? I'm proposing we replace.
2. Should we also import `addr:suburb`, `addr:state` and
`addr:postcode` tags? I'm proposing we do.
I vote for adding the information. I have been adding it where possible.
In theory the POI should know in which State or LGA it sits but the
reality is that this does not result in the user having complete
addresses on the POI. E.g. restaurants don't have the suburb
automatically shown in OSMAnd.

I would also argue that the information is part of the full address of
the house/building. Else I dare say we should have a similar discussion
for phone numbers. We don't need to add +61 or (0)2 as this is implicit
by the POI location.
Given postcode regions aren't mapped, then adding these to the address
should be very helpful.
`addr:state` is less important given these addresses fall within the
Victoria state admin boundary already. The wiki touches on this saying
"A few mappers consider higher-level tags, or even addr:city=* as
redundant, since they could be calculated from the respective boundary
relations they are contained in (if present and valid). However, such
practice has severe disadvantages and can lead to wrong results."
Either way, I don't think it matters too much, but since it's not
harmful to include, and might provide some benefit, then we may as
well include `addr:state`?
State and post code are part of the full address. I vote for including
it.
`addr:suburb` is similar to `addr:state`, suburb/locality boundaries
are already well mapped in Victoria. Since we have this detail from
the source data I think we probably should still include it.
3. `addr:suburb` vs `addr:city`.
Both tags are in use within Australia. According to taginfo
(https://taginfo.geofabrik.de/australia-oceania/australia/search?q=addr%3A)
within Australia addr:suburb occurs 521,671 times and addr:city
562,542 times.
The iD address preset uses addr:suburb.
Victoria only has a handful of place=city objects
(https://overpass-turbo.eu/s/17vc), Melbourne, Geelong, Ballarat,
Bendigo, Shepparton, Warrnambool, Traralgon, Bairnsdale, Wangaratta,
Wodonga, Horsham, Mildura.
For addressing, it's the suburb/locality that appears on the
address, not the city (e.g. the Melbourne place/city covers the whole
Greater Melbourne urban area, but not all the addresses there include
"Melbourne", only those within the CBD area where the Melbourne
place=suburb exists).
While in rural areas it's a locality not a suburb, the two usually go
hand in hand, and I'd say it's okay to still tag these as addr:suburb
even though it's technically a locality and not a suburb.
In this way I'd argue that addr:city has no place in Australia
(convince me otherwise).
Not sure if I want to convince you.
To me this sounds like different names for the same thing. Aren't City
or Suburb just different words for the next level up from Street?

Australia Post refers to 'placename/suburb/locality' (page 25):
https://auspost.com.au/content/dam/auspost_corp/media/documents/australia-post-addressing-standards-1999.pdf

For the NSW import I recall that I settled for city=<name that
corresponds to a post code>.
Maybe for this import, where we find an address existing in OSM and it
has addr:city which matches the addr:suburb from our Vicmap address,
then we automatically swap it to addr:suburb?
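To make the proposed swap concrete, here is a minimal sketch of the rule described above. It is not taken from the vicmap2osm code; the function name, the case-insensitive comparison and the choice to leave an address alone when it already has addr:suburb are all assumptions made for illustration.

```python
def swap_city_to_suburb(osm_tags, vicmap_suburb):
    """Return a copy of osm_tags with addr:city renamed to addr:suburb
    when its value matches the Vicmap suburb (hypothetical helper)."""
    tags = dict(osm_tags)
    city = tags.get("addr:city")
    if city is None or "addr:suburb" in tags:
        # Nothing to swap, or a suburb is already tagged: leave untouched.
        return tags
    # Assumption: compare ignoring case and surrounding whitespace.
    if city.strip().casefold() == vicmap_suburb.strip().casefold():
        tags["addr:suburb"] = tags.pop("addr:city")
    return tags
```

An existing address whose addr:city does not match the Vicmap suburb comes back unchanged, so it could be sent for manual review instead.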
Joseph Guillaume
2021-05-19 08:40:48 UTC
Permalink
The initiative sounds good to me.

It sounds like this might be on a tight timeline, so a simple manual merge
one small group at a time would definitely not work. It seems any community
involvement would be, e.g. through a (post-import?) maproulette task. This
is therefore quite a different import than I expect Yuchen's would have
been.

Re 1. I support replacing interpolation ways because individual addresses
provide more flexibility for address placement.

Re 2.,3. I've noticed someone deleting addr region tags, which is why I
haven't been adding them (in Canberra). I don't really mind either way, but
recognise that others do.

I'm ok with addresses being added as lone nodes if they are not already
present. It seems like the only option for a large import.

My experience is that the position of units can be much more ambiguous in
some datasets (including being superimposed at the same coordinates), so my
intuition would be to only include housenumber and not unit number.
I see the intention is to use addr:flats ranges and addr:flats1,
addr:flats2 as needed. I don't have a strong opinion but it does seem like
this is a better solution than just dropping that data.

Cheers,

Joseph
Andrew Davidson
2021-05-20 10:39:06 UTC
Permalink
I'm trying to do as much of the heavy lifting, preparing this data to be able to import as much as possible automatically. Although if the community doesn't feel confident enough in this process, we could still revert to manually reviewing everything, but I'm trying to ensure a high standard so we don't need to do everything manually.
I'm all for doing as much as possible automatically, if we try to do it
manually it's going to take forever.
There's going to be some addresses which we detect as conflicts with existing data which isn't safe to automatically resolve, I've started on some code to spit these out into a MapRoulette challenge or possibly as map notes (so that surveying apps like StreetComplete can pick them up). I'm still working on this part of the code though.
It would be interesting to know how many addresses there are mapped
already (and how far they are from the Vic dataset) as that would give
an idea as to how much work there would be in resolving which one to use.
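As a rough sketch of how that comparison could be measured, the snippet below computes the great-circle distance in metres between addresses present in both datasets. Keying addresses by (housenumber, street) and the function names are assumptions for illustration, not how vicmap2osm matches addresses.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a sphere of radius 6,371 km."""
    r = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def match_distances(osm, vicmap):
    """osm and vicmap map a hypothetical (housenumber, street) key to (lat, lon).
    Returns {key: metres} for the keys found in both datasets."""
    return {k: haversine_m(*osm[k], *vicmap[k]) for k in osm.keys() & vicmap.keys()}
```

The number of keys returned gives the count of already-mapped addresses, and the distribution of the distances hints at how many would need a human to decide which position to keep.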
My experience is that the position of units can be much more ambiguous in some datasets (including being superimposed at the same coordinates), so my intuition would be to only include housenumber and not unit number.
I see the intention is to use addr:flats ranges and addr:flats1, addr:flats2 as needed. I don't have a strong opinion but it does seem like this is a better solution than just dropping that data.
Yeah, ignoring unit number would make the import much easier, but I have tried to include it in a reasonable way in this import.
I think that your idea is a good compromise. Otherwise we have to drop
the unit information or arbitrarily spread out a pile of address nodes.
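For anyone wondering what collapsing units into an addr:flats value might look like, here is a minimal sketch. It assumes purely numeric unit numbers and a semicolon-separated list of ranges; the actual vicmap2osm logic (including how it overflows into addr:flats1, addr:flats2) may well differ.

```python
def flats_ranges(units):
    """Collapse integer unit numbers into a range string such as '1-3;5;7-8'.
    Hypothetical helper; the separator and format are assumptions."""
    nums = sorted(set(units))
    if not nums:
        return ""
    parts = []
    start = prev = nums[0]
    for n in nums[1:]:
        if n == prev + 1:
            prev = n  # still inside a consecutive run
            continue
        # Run ended: emit it as a single number or a start-end range.
        parts.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = n
    parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ";".join(parts)
```

With this, units 1, 2, 3, 5, 7 and 8 superimposed at one coordinate would become a single node carrying the value 1-3;5;7-8 rather than six stacked nodes.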
cleary
2021-05-19 09:11:24 UTC
Permalink
I think I recall discussion some months ago about incorrect suburbs being assigned addresses in Nominatim when relying on suburb boundaries. I think I recall that most errors occurred near the boundaries rather than the centre of areas, and more often when the suburb has an irregular shape (not many suburb areas are even close to rectangular in shape). Therefore I would support inclusion of the suburb/town/hamlet in addresses to ensure accuracy.

In regard to addr:suburb and addr:city, I have always tried to match the address with the designation in OSM. So if an address is in a bounded area identified as "place=town" then I added addr:town=* or for a hamlet it would be addr:hamlet=* etc. Localities are, by definition, unpopulated places so it would be unusual to have addr:locality= as almost all "localities" would be small places located within bounded areas such as hamlets, towns, etc. Bounded areas that have no shops, schools, amenities (or almost none) would usually be place=hamlet as they are identified bounded areas usually with a population. For the purposes of the import, it might be too difficult to separate OSM's various place classifications city/suburb/town/village/hamlet etc so I would support addr:suburb=* as better than nothing.

I would also support adding postcodes. I am more familiar with NSW than other states but there are three bounded areas in NSW named Kingswood differentiated only by their postcodes and local government areas. Similarly I know of two bounded areas in NSW named Long Plain. I think there are others but they don't come readily to mind. However I think it emphasises the usefulness of postcodes in addresses.
1. How should we handle existing address interpolation ways? Should
these be left as they are or replaced with individually mapped address
points? I'm proposing we replace.
2. Should we also import `addr:suburb`, `addr:state` and
`addr:postcode` tags? I'm proposing we do.
Given postcode regions aren't mapped, then adding these to the address
should be very helpful.
`addr:state` is less important given these addresses fall within the
Victoria state admin boundary already. The wiki touches on this saying
"A few mappers consider higher-level tags, or even addr:city=* as
redundant, since they could be calculated from the respective boundary
relations they are contained in (if present and valid). However, such
practice has severe disadvantages and can lead to wrong results."
Either way, I don't think it matters too much, but since it's not
harmful to include, and might provide some benefit, then we may as well
include `addr:state`?
`addr:suburb` is similar to `addr:state`, suburb/locality boundaries
are already well mapped in Victoria. Since we have this detail from the
source data I think we probably should still include it.
3. `addr:suburb` vs `addr:city`.
Both tags are in use within Australia. According to taginfo
(https://taginfo.geofabrik.de/australia-oceania/australia/search?q=addr%3A) within Australia addr:suburb occurs 521 ,671 times and addr:city 562,542 times.
The iD address preset fields uses addr:suburb.
Victoria only has a handful of place=city objects
(https://overpass-turbo.eu/s/17vc), Melbourne, Geelong, Ballarat,
Bendigo, Shepparton, Warrnambool, Traralgon, Bairnsdale, Wangaratta,
Wodonga, Horsham, Mildura.
Because for addressing, it's the suburb/locality that appears on the
address not the city (eg. Melbourne place/city covers the whole greater
melbourne urban area, but not all the addresses here include
"Melbourne", only those within the CBD area where the Melbourne
place=suburb exists.
While in rural areas it's a locality not a suburb, the two usually go
hand in hand, and I'd say it's okay to still tag these as addr:suburb
even though it's technically a locality and not a suburb.
In this way I'd argue that addr:city has no place in Australia
(convince me otherwise).
Maybe for this import, where we find an address existing in OSM and it
has addr:city which matches the addr:suburb from our Vicmap address,
then we automatically swap it to addr:suburb?
Andrew Davidson
2021-05-20 10:15:35 UTC
Permalink
Post by cleary
I think I recall discussion some months ago about incorrect suburbs being assigned addresses in Nominatim when relying on suburb boundaries. I think I recall that most errors occurred near the boundaries rather than the centre of areas, and more often when the suburb has an irregular shape (not many suburb areas are even close to rectangular in shape). Therefore I would support inclusion of the suburb/town/hamlet in addresses to ensure accuracy.
Can you remember any examples of this? I have found that once you remove
all of the redundant information, Nominatim works quite well. Now that
they also update postal code boundaries it can figure those out as
well. So we have both sides of Hindmarsh Dr:

https://nominatim.openstreetmap.org/ui/details.html?osmtype=W&osmid=546709043

https://nominatim.openstreetmap.org/ui/details.html?osmtype=W&osmid=667114126

which are correctly identified as being in different suburbs and post codes.
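For readers unfamiliar with how a consumer can work out the suburb from a boundary rather than from a tag, here is a minimal point-in-polygon sketch using ray casting. It is not how Nominatim is implemented; it treats each boundary as a single simple ring of (lon, lat) pairs, and the function names are made up for illustration.

```python
def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the ring of (lon, lat) vertices?"""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does the edge (j -> i) straddle the horizontal line through the point?
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside

def suburb_for(lon, lat, boundaries):
    """Return the name of the first boundary containing the point, else None.
    boundaries is a hypothetical {name: ring} mapping."""
    for name, ring in boundaries.items():
        if point_in_ring(lon, lat, ring):
            return name
    return None
```

Two address points on opposite sides of a shared boundary edge resolve to different names, which is the behaviour shown in the two Hindmarsh Dr links above.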
Yuchen Pei
2021-05-19 11:17:23 UTC
Permalink
Joseph: I am new to OSM, was exploring the possibility of
importing VIC addresses and definitely did not have an import plan
:)

Is the reason that you think it should be done quickly to avoid
dirty reverts?

Andrew: Regarding the proposal, looks good so far. I have not
finished reading the code yet.

I agree with:
- The plan of substituting addr:city with addr:suburb when the two
match, given suburbs and localities are used for addresses
(https://en.wikipedia.org/wiki/Suburbs_and_localities_(Australia)).
- The plan of using addr:flats for overlapping units

Regarding unit type, why not just use addr:unit for all types,
including "UNIT", "SHOP", "SUITE" etc. instead of discarding the
information?

Thanks for putting together this proposal - really well written
(and educational too)!

Best,
Yuchen
Andrew Davidson
2021-05-20 10:00:54 UTC
Permalink
2. Should we also import `addr:suburb`, `addr:state` and `addr:postcode`
tags? I'm proposing we do.
Please don't do that. Now that we have a complete set of admin_level 10
boundaries in Australia addr:suburb is now redundant.
Given postcode regions aren't mapped, then adding these to the address
should be very helpful.
Postcodes can be added once to the level 10 admin boundary or as a
separate postal_code boundary if they don't align.
`addr:state` is less important given these addresses fall within the
Victoria state admin boundary already. The wiki touches on this saying
"A few mappers consider higher-level tags, or even addr:city=* as
redundant, since they could be calculated from the respective boundary
relations they are contained in (if present and valid). However, such
practice has severe disadvantages and can lead to wrong results."
If you read the explanation on the talk page this does not apply to
Australia because we have contiguous postal areas.
Either way, I don't think it matters too much, but since it's not
harmful to include, and might provide some benefit, then we may as well
include `addr:state`?
It is actually harmful because you have to maintain the same piece of
information in thousands of places. Not only that, having multiple
sources of information seems to confuse Nominatim. On the other hand
adding it has no real benefit, so I would suggest that you don't import
anything beyond the street name.
3. `addr:suburb` vs `addr:city`.
The addr:city is for countries where the postal address has the concept
of a postal city. As we don't have that in Australia then we don't need
to have this tag.
Both tags are in use within Australia. According to taginfo
(https://taginfo.geofabrik.de/australia-oceania/australia/search?q=addr%3A)
within Australia addr:suburb occurs 521,671 times and addr:city 562,542
times.
About 99.99+ % of those addr:city are from the Brisbane address import
and I would view that as a mistake rather than a reason to perpetuate
this elsewhere.
Sebastian S.
2021-05-20 10:52:37 UTC
Permalink
2. Should we also import `addr:suburb`, `addr:state` and `addr:postcode`
tags? I'm proposing we do.
Please don't do that. Now that we have a complete set of admin_level 10
boundaries in Australia addr:suburb is now redundant.
I agree in theory, but in practice this does not work.
I don't get the full address of a POI when checking in OSMAnd or other consumers, I only get a partial address which is not correct or complete.

We have also agreed to include +61 and 2 (area code) in phone numbers. The same argument could be applied here, right?
Given postcode regions aren't mapped, then adding these to the address
should be very helpful.
Postcodes can be added once to the level 10 admin boundary or as a
separate postal_code boundary if they don't align.
`addr:state` is less important given these addresses fall within the
Victoria state admin boundary already. The wiki touches on this saying
"A few mappers consider higher-level tags, or even addr:city=* as
redundant, since they could be calculated from the respective boundary
relations they are contained in (if present and valid). However, such
practice has severe disadvantages and can lead to wrong results."
If you read the explanation on the talk page this does not apply to
Australia because we have contiguous postal areas.
Either way, I don't think it matters too much, but since it's not
harmful to include, and might provide some benefit, then we may as well
include `addr:state`?
It is actually harmful because you have to maintain the same piece of
information in thousands of places. Not only that, having multiple
sources of information seems to confuse Nominatim. On the other hand
adding it has no real benefit, so I would suggest that you don't import
anything beyond the street name.
Again I think this argument is misleading.
What kind of maintenance do you anticipate for post code and locality?
While I see plots of land being split and getting a new house number, the probability of a street name changing is much lower. Locality and post code are even less likely to change, in my opinion.
Fabricated scenarios such as Narrabeen and North Narrabeen getting different post codes? Unlikely, I would say.
3. `addr:suburb` vs `addr:city`.
The addr:city is for countries where the postal address has the concept
of a postal city. As we don't have that in Australia then we don't need
to have this tag.
Thanks for this explanation
Both tags are in use within Australia. According to taginfo
(https://taginfo.geofabrik.de/australia-oceania/australia/search?q=addr%3A)
within Australia addr:suburb occurs 521,671 times and addr:city 562,542
times.
About 99.99+ % of those addr:city are from the Brisbane address import
and I would view that as a mistake rather than a reason to perpetuate
this elsewhere.
I tend to agree
Andrew Davidson
2021-05-20 11:21:53 UTC
Permalink
Post by Sebastian S.
I agree in theory, but in practice this does not work.
I don't get the full address of a POI when checking in OSMAnd or other consumers, I only get a partial address which is not correct or complete.
Nominatim can get it right so that's a problem with OSMAnd. I suggest
that you report that problem to them.
Post by Sebastian S.
We have also agreed to include +61 and 2 (area code) in phone numbers. The same argument could be applied here right?
Phone numbers should be tagged in E.123 or DIN5008 format as per the
wiki (https://wiki.openstreetmap.org/wiki/Key:phone). The same argument
doesn't apply to street addresses.
Post by Sebastian S.
Again I think this argument is misleading.
What kind of maintenance do you anticipate for post code and locality?
Localities regularly get redefined (there are usually a few in each
quarterly release of G-NAF) and mappers add their own ideas which would
have to be periodically checked if we kept anything beyond addr:street
(if we don't then they can just be scrubbed off).

So the maintenance load is greater than zero and, as I have said
already, it does not add any value.
cleary
2021-05-21 00:22:09 UTC
Permalink
In a previous post on this topic, I suggested it was important to include 'suburb' and postcode in the proposed address import because of my prior experience and discussion with other mappers about problems with Nominatim.

However, after reading Andrew Davidson's comments, I have checked some locations that I considered vulnerable to incorrect addressing. I was impressed to find that the only incorrect ones were those for which a suburb had been included in the address node but suburb boundaries had subsequently been changed so that the old (now incorrect) address continues to prevail whereas, without that tag in the node, the correct address would have been shown by Nominatim.

So I withdraw my previous comments and defer to Andrew Davidson on this point.
Graeme Fitzpatrick
2021-05-21 06:48:17 UTC
Permalink
With regard to postcodes how does it work when there are 2 postcodes out of
the same mail centre?

EG Gold Coast Mail Centre at Bundall is postcode 4217, but the GC City
Council, which is in that area, has its own postcode 9726?

I'll assume that this isn't limited to GC, so would that cause any issues?

Thanks

Graeme


On Fri, 21 May 2021 at 16:17, Andrew Harvey via Talk-au wrote:
Post by Andrew Davidson
Please don't do that. Now that we have a complete set of admin_level 10
boundaries in Australia addr:suburb is now redundant.
This prompted me to see where the Vicmap suburb value differed from the
OSM admin_level=10.
After excluding some special cases, there were 62 Vicmap addresses with a
locality different to what we have in OSM. It looks like a bunch are bad
Vicmap data, most of the rest are address points practically on the admin
boundary line, there's only a small handful otherwise to deal with.
https://gitlab.com/alantgeo/vicmap2osm/-/jobs/1279647817/artifacts/raw/dist/vicmapSuburbDiffersWithOSM.geojson
My analysis suggests it is mostly fine, from a data consistency point of
view, to rely on the admin_level 10 suburb, except for a handful of cases
(and even then it's not clear whether Vicmap or OSM is correct).
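A comparison like the one described above can be sketched roughly as follows. This is only an illustration, not the vicmap2osm code: the function names, the simple ray-casting test and the sample coordinates are all assumptions, and real boundaries would need multipolygon and hole handling.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test; ring is a list of (lon, lat) vertices."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does this edge straddle the horizontal line through the point?
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def suburb_mismatches(addresses, boundaries):
    """Return (id, stated suburb, containing suburb) where the two differ."""
    result = []
    for addr_id, lon, lat, stated in addresses:
        for name, ring in boundaries.items():
            if point_in_polygon(lon, lat, ring):
                if name != stated:
                    result.append((addr_id, stated, name))
                break
    return result


# Hypothetical boundaries and address points, not real Vicmap or OSM data.
boundaries = {
    "Alpha": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "Beta": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
addresses = [
    (1, 1.0, 1.0, "Alpha"),
    (2, 3.0, 1.0, "Alpha"),  # stated suburb disagrees with the boundary
    (3, 2.5, 0.5, "Beta"),
]
print(suburb_mismatches(addresses, boundaries))
```

Points sitting practically on a boundary line are exactly where a test like this is least reliable, which matches the cases described above.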
Post by Andrew Davidson
Postcodes can be added once to the level 10 admin boundary or as a
separate postal_code boundary if they don't align.
- 2912 have only one distinct postcode from Vicmap data,
- 9 have >1 postcode from Vicmap
- a handful have no addresses
Of the 9 that have >1 postcode, 7 have only 1 address with a different
postcode, 1 has only 3 addresses with a different postcode, and one
suburb/locality (Melbourne suburb) has two main postcodes: 3000 with 51,458
addresses and 3004 with 12,158 addresses. For the Melbourne
case, it's clear that Melbourne CBD is 3000 and areas south are 3004. We could
add a separate boundary=postal_code for Melbourne.
My analysis suggests that adding postal_code to the level 10 admin boundary
is safe for pretty much the whole state, except for Melbourne where we can
add a postal_code boundary.
I'll follow up in another email about the other points raised.
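The per-suburb postcode tally described above could be reproduced along these lines. This is a minimal sketch using only the standard library; the function names and sample rows are hypothetical and are not taken from the import code.

```python
from collections import Counter, defaultdict


def postcodes_by_suburb(addresses):
    """Map each suburb to a Counter of the postcodes seen on its addresses."""
    counts = defaultdict(Counter)
    for suburb, postcode in addresses:
        counts[suburb][postcode] += 1
    return counts


def split_suburbs(counts):
    """Return (single, multiple): suburbs with one postcode vs. more than one."""
    single = {s: next(iter(c)) for s, c in counts.items() if len(c) == 1}
    multiple = {s: dict(c) for s, c in counts.items() if len(c) > 1}
    return single, multiple


# Hypothetical sample rows, not real Vicmap data.
sample = [
    ("Melbourne", "3000"), ("Melbourne", "3000"), ("Melbourne", "3004"),
    ("Carlton", "3053"), ("Carlton", "3053"),
]
single, multiple = split_suburbs(postcodes_by_suburb(sample))
print(single)
print(multiple)
```

Suburbs in `single` are candidates for a postal_code tag on the level 10 boundary; those in `multiple` need a closer look, as with Melbourne.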
Andrew Davidson
2021-05-24 11:14:56 UTC
For now the iD preset doesn't show any inherited attributes, so
it will prompt mappers to supply these fields. While it might be
redundant, it's also not wrong, would you go so far as removing these
fields other mappers manually add?
Ideally, if we agreed that anything beyond addr:street was not
necessary, we would ask the iD developers to change the AU address
format to:

{
"countryCodes": ["au"],
"format": [
["unit","housenumber", "street"]
]
}

and it would stop asking users to enter the redundant information. This
would be a literal one line change to a configuration file.
If we tolerate when mappers manually enter these, then I'd say
filling in these values is worth it, it shows the address as
complete, and prevents other mappers manually adding this information
we could have just imported anyway.
We've never had a discussion about this. Until last year we didn't have
a complete set of bounded localities, so people would have to put this
information in. Now that we've finished, people are wasting their time
if they do.
If you are relying on the level 10 boundary for addr:suburb, that
means data consumers need to know that for Australia, addr:suburb
comes from admin_level 10. In other regions it might be different and
this information isn't really stored in OSM.
The first stop for data consumers would be to look at the Nominatim
address format configuration file.

I think we need to back up here and think about why you would import
address data into OSM. If we do this import then we'll get:

1. A rendering of street numbers at high zoom levels.
2. Nominatim searches would be able to return results at the street
number level.

Both of which would be nice to have. However, if you were a downstream
data consumer, who was looking for street addressing data in Australia,
you would have three sources:

1. Extracting address data from OSM.
2. Various state government address datasets.
3. G-NAF

I think we need to be realistic here. OSM is never going to be able to
compete with G-NAF. Because importing and updating the data will take
time and effort then the copy in OSM will likely be out of date, so a
data consumer is going to be better off pulling their data straight from
the source.
Graeme Fitzpatrick
2021-05-24 23:38:03 UTC
Now that we've finished, people are wasting their time if they do.
Thanks, Andrew, I'll stop adding suburbs as I'm updating addresses!

I think we need to back up here and think about why you would import
1. A rendering of street numbers at high zoom levels.
2. Nominatim searches would be able to return results at the street
number level.
Both of which would be nice to have.
As somebody who uses OSMAND for navigation, I'd *really* like to have all
street addresses available!

Thanks

Graeme
Yuchen Pei
2021-05-25 00:02:22 UTC
On Mon, 24 May 2021 at 21:29, Andrew Davidson
Now that we've finished, people are wasting their time if they do.
Thanks, Andrew, I'll stop adding suburbs as I'm updating addresses!
I think we need to back up here and think about why you would import
1. A rendering of street numbers at high zoom levels.
2. Nominatim searches would be able to return results at the street
number level.
Both of which would be nice to have.
As somebody who uses OSMAND for navigation, I'd *really* like to have all
street addresses available!
Hear hear. I was forced to use some proprietary apps and maps
several times because of lack of street addresses.
Thanks
Graeme
Graeme Fitzpatrick
2021-05-25 04:05:46 UTC
On Tue, 25 May 2021 at 12:06, Andrew Harvey via Talk-au <
From what I can see Nominatim pulls any parent place=* or
boundary=administrative area. Which is fine, even though sometimes these
boundaries or places don't form part of the postal address, it's harmless
for data consumers to display these.
Nominatim can and does correctly deal with addresses which have omitted
these tags. I tried OSMAnd and it does seem to pick up the suburb from a
parent boundary, and it seems to ignore postcode regardless of it being on
a parent boundary, or addr:postcode next to the number/street.
On this, I'm not sure if this may be an OSMAND problem, or an issue with
the way the info has been loaded in OSM, but here's something that I
noticed a few weeks ago, that may apply here?

Had to go visit a bloke for the first time, & his address is Watson Road,
Armstrong Creek:

https://www.openstreetmap.org/search?query=Watson%20Road%2C%20Armstrong%20Creek#map=17/-27.23885/152.81791

As you can see, Nominatim shows "Watson Road, Armstrong Creek, Kobble
Creek, Moreton Bay Regional Council", but when you look at Watson Road, all
it says is highway=tertiary + name=Watson Road, with no suburb / city,
State or postcode listed.

In this area, Watson Road is the boundary between Armstrong Creek & Kobble
Creek "localities"

AC: https://www.openstreetmap.org/relation/11675264#map=13/-27.2282/152.7932

KC: https://www.openstreetmap.org/relation/11675296#map=13/-27.2679/152.8018

KC has also been created as a "village"
https://www.openstreetmap.org/node/310513460, although there is no village
as such, it's just an area name.

When I search via OSMAND though, it will find Watson Road, Kobble Creek,
but won't find Watson Road, Armstrong Creek? It will also find Kobble as a
suburb, & Kobble Creek as a village, but won't find Armstrong Creek?

So, is this a peculiarity of the way the PSMA Admin Boundaries have been
created, or something in OSMAND?

Thanks

Graeme
Daniel O'Connor
2021-05-25 06:41:56 UTC
Post by Andrew Davidson
For now the iD preset doesn't show any inherited attributes, so
it will prompt mappers to supply these fields. While it might be
redundant, it's also not wrong, would you go so far as removing these
fields other mappers manually add?
Ideally, if we agreed that anything beyond addr:street was not
necessary, we would ask the iD developers to change the AU address
I'd make a polite argument there is still value in at least the suburb,
possibly postcode being still provided. When exporting data via overpass
as CSV; it's not currently easy or obvious to appropriately bring in the
parent attributes; even if it is for a Real Human looking at the map.
There's a fair number of use cases for "data in a spreadsheet
friendly format" I feel.

Yes, it does come with a maintenance problem when suburbs change or
postcodes merge, but I feel that's one problem for one set of folks - us
maintainers - vs a repeated problem for many simple data consumers.
Ideally, as maintainers, we would over time semi-automate this with
tooling (much like the proposed import).
Andrew Davidson
2021-05-27 10:39:52 UTC
Post by Daniel O'Connor
I'd make a polite argument there is still value in at least the suburb,
possibly postcode being still provided.  When exporting data via
overpass as CSV; it's not currently easy or obvious to appropriately
bring in the parent attributes; even if it is for a Real Human looking
at the map.
There's a fair number of use cases for "data in a spreadsheet
friendly format" I feel.
You don't need to add addr:suburb to get that, all you need is a little
Python.

Assuming you have a csv dump of the address points from OSM eg:

@type,@id,@lat,@lon,addr:unit,addr:housenumber,addr:street
node,34495141,-35.2641690,149.1223146,,3,Sargood Street
node,40293773,-35.2640376,149.1226107,,9,Sargood Street
node,254020381,-35.2623407,149.1451050,1,5,Edgar Street
node,291548764,-35.3847749,149.0720245,,56,Mannheim Street
node,318854867,-35.3339561,149.1697838,,289,Canberra Avenue
node,318855426,-35.3244730,149.1792480,4,59-61,Wollongong Street
node,318856277,-35.3150098,149.1417359,,19,Jardine Street
node,318859652,-35.3627241,149.0815960,,70,Hodgson Crescent
node,318859688,-35.3627835,149.0817144,,70,Hodgson Crescent
.
.
.


and you've the corresponding admin_level 10 and post code boundaries in
geojson:

act_suburbs.geojson
postcodes.geojson

then you import the libraries you need:

import pandas as pd
import geopandas as gpd

read in the address points:

addlist= pd.read_csv('act_address_dump.csv',low_memory=False)

convert the list to a geoframe:

address_points = gpd.GeoDataFrame(addlist,crs="EPSG:4326",geometry=gpd.points_from_xy(addlist['@lon'],addlist['@lat']))


read in the suburb boundaries:

suburbs = gpd.read_file('act_suburbs.geojson')

drop all of the tags that we will not need:

suburbs = suburbs[['name','geometry']]

then do the same for the post code boundaries:

postcodes = gpd.read_file('postcodes.geojson')
postcodes = postcodes[['postal_code','geometry']]

now we merge the three data sets together with a series of spatial
joins. First the suburb names:

address_points = gpd.sjoin(address_points,suburbs,predicate="within")

the join creates a column we don't need so get rid of that:

address_points = address_points.drop(['index_right'], axis=1)

then join the post codes:

address_points = gpd.sjoin(address_points,postcodes,predicate="within")

we've now got all of the data into the one frame but we need to clean up
the column labels before we write it out, so do a rename:

address_points = address_points.rename(columns={"name":"addr:suburb","postal_code":"addr:postcode"})

and we can then write out the columns we want to a csv file:

address_points[['@type','@id','@lat','@lon','addr:unit','addr:housenumber','addr:street','addr:suburb',
'addr:postcode']].to_csv('act_out.csv')

which gives you:

,@type,@id,@lat,@lon,addr:unit,addr:housenumber,addr:street,addr:suburb,addr:postcode
310,node,2441363738,-35.3076927,149.1333269,,7,National Circuit,Barton,2600
2280,way,564187362,-35.1539837,149.1117804,,5,Jimmy Little Street,Moncrieff,2914
4414,way,823380125,-35.2242021,149.0456133,,55,Ennor Crescent,Florey,2615
2249,way,547120674,-35.2540932,149.1531645,,24,Piper Street,Ainslie,2602
1548,way,220316259,-35.3349388,149.0923894,,27,Coxen Street,Hughes,2605
4511,way,847394981,-35.2353182,149.0470223,,2,Diggles Street,Page,2614
3747,way,796706631,-35.2288001,149.0513507,,4,Caddy Place,Florey,2615
555,node,4214686496,-35.318041,149.1264149,,39,Empire Circuit,Forrest,2603
3280,way,776943661,-35.4468204,149.1164925,,8,Mackerras Crescent,Theodore,2905
1052,node,7930404220,-35.1705767,149.0708312,,13,Gladstone Street,Hall,2618

I did this in an interactive ipython session, but if this is something
people want it could be easily turned into a Python script that does the
pull from overpass and writes out the file.

I did the whole country in one go to see how well it scales and the run
time was pretty much the same. Of course you can't do postcodes for
everywhere as we haven't put them all in yet.
Daniel O'Connor
2021-05-27 11:03:50 UTC
Post by Andrew Davidson
Post by Daniel O'Connor
I'd make a polite argument there is still value in at least the suburb,
possibly postcode being still provided. When exporting data via
overpass as CSV; it's not currently easy or obvious to appropriately
bring in the parent attributes; even if it is for a Real Human looking
at the map.
There's a fair number of use cases for "data in a spreadsheet
friendly format" I feel.
You don't need to add addr:suburb to get that, all you need is a little
Python.
Look, as a dev as well, yes this is absolutely doable.
If this were a one click addon to an overpass query or otherwise massively
dropped the barriers for non devs, fantastic.
But at least for me personally, the folks like us that can do a data import
ALSO have the skills to handle bulk edits for maintenance, I feel we should
make it as easy as possible to use the resulting data.
Andrew Davidson
2021-05-27 11:39:50 UTC
Post by Daniel O'Connor
Look, as a dev as well, yes this is absolutely doable.
If this were a one click addon to an overpass query or otherwise
massively dropped the barriers for non devs, fantastic.
But at least for me personally, the folks like us that can do a data
import ALSO have the skills to handle bulk edits for maintenance, I feel
we should make it as easy as possible to use the resulting data.
OK, so you already know how to do this? So when I thought I was being
helpful, I was actually wasting my time?

Cheers thanks for that.
Andrew Davidson
2021-05-27 11:35:40 UTC
Okay, so it sounds like we have a few people advocating including the
full address tags even when they could be derived from a parent
boundary object mostly because it makes it easier for some data
consumers,
Who are these data consumers? The only two concrete data consumers we
have are OSM Carto and Nominatim. Neither of which require anything
beyond street to function correctly.

People keep throwing up hypotheticals and when I show how you would deal
with these supposed problems, the assumptions get moved and we continue
the argument.

Let me be blunt here: if you don't have a smidgen of coding or GIS
expertise you are not going to get any value out of trying to extract
the address information from OSM and do something with it.
cleary
2021-06-08 06:15:20 UTC
Thanks Andrew. A considered and thoughtful response. I support your proposed actions. Your work for OSM is always very good and much appreciated.
To sum up the contentious issue of suburb, postcode, state tags,
- Phil, Daniel and Seb would prefer the suburb and postcode on each
address object.
- Andrew Davidson and cleary would prefer we not include suburb and
postcode on each address object and instead require data consumers to
derive this data from the existing boundaries, and actively discourage
mappers manually adding this data via removing the preset in iD.
Thinking further I'd support including the full address details on each
address object, to provide a complete address, even if duplicated by
the boundary. QA tools could be built to validate these match the admin
boundaries and it becomes a maintenance task to maintain these tags,
but I think that's okay.
However, to avoid stalling this import on this issue (it doesn't sound
like anyone will change their mind soon), I'll plan the minimum viable
option of excluding addr:suburb, addr:postcode and addr:state from the
import.
There's nothing stopping a further discussion of a planned automated
edit to update address objects with suburb, postcode and state if the
community changes their mind later on.
I'll make these changes to the import code, then once I've completed
all the documentation and remaining issues hopefully post some import
candidate files if anyone would like to review.
Sebastian S.
2021-05-20 10:09:53 UTC
Hi Andrew,
Great interaction and transparency!

I have not read the code, will have a look but not sure how much I will understand. Therefore I'm asking how do you determine the location of the POI to be added?

In NSW the address data was part of an area of the plot of land the address is for. So as part of the import process the area was converted to a node location.
Long driveways or other thin parts of the plot resulted in the node often being outside of the actual area.
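To illustrate the effect just described, here is a small sketch that computes the area centroid of an L-shaped lot and shows that it falls outside the lot. The lot shape, coordinates and function names are made up for the example and do not describe how the NSW import actually derived its nodes.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3 * area2), cy / (3 * area2)


def in_rect(x, y, rect):
    """True if (x, y) lies inside rect = (min_x, min_y, max_x, max_y)."""
    min_x, min_y, max_x, max_y = rect
    return min_x <= x <= max_x and min_y <= y <= max_y


# Hypothetical lot: a 10 x 1 strip (say, a driveway) joined to a 1 x 10 strip.
lot_ring = [(0, 0), (10, 0), (10, 1), (1, 1), (1, 10), (0, 10)]
lot_parts = [(0, 0, 10, 1), (0, 0, 1, 10)]

cx, cy = polygon_centroid(lot_ring)
inside_lot = any(in_rect(cx, cy, part) for part in lot_parts)
print(round(cx, 3), round(cy, 3), inside_lot)
```

Where this matters, a "point on surface" style calculation, which is guaranteed to land inside the polygon, is a common alternative to the plain centroid.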

Also most plots of land have the house towards one end and garden in the other. This results in the node outside of the building.

I assume that you do not intend to manually correct node locations such that they are on top of a unit.
I've started to summarise the general consensus to some of the import
questions at https://gitlab.com/alantgeo/vicmap2osm#community-feedback.
Another thing is where someone has mapped an address with
housenumber=2/5, but Vicmap is indicating it's unit 2 number 5, we
convert this to addr:unit=2 addr:housenumber=5. This is slightly
overstepping simply importing Vicmap data, to changing existing data in
OSM (which mostly for the rest of the import, if OSM says something
different we flag it for manual review), so I'm happy to also skip
this.
However, given X/Y usually means unit/number, and then only where
Vicmap data confirms this, and given addr:unit is a widely used tag for
unit value, I feel it's probably best we do this.
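As a rough illustration of the rule described above (split X/Y only where Vicmap confirms the unit and number), a sketch might look like the following. The function name, the regular expression and the decision to skip objects that already carry addr:unit are assumptions made for this example, not the actual vicmap2osm logic.

```python
import re

# Matches values such as "2/5" or "4/59-61": unit, slash, house number.
UNIT_NUMBER = re.compile(r"^\s*(\w+)\s*/\s*(\w+(?:-\w+)?)\s*$")


def split_unit(osm_tags, vicmap_unit, vicmap_number):
    """Return updated tags if the X/Y value matches Vicmap, else None."""
    match = UNIT_NUMBER.match(osm_tags.get("addr:housenumber", ""))
    if match is None or "addr:unit" in osm_tags:
        return None  # nothing to split, or a unit is already tagged
    unit, number = match.groups()
    if unit != vicmap_unit or number != vicmap_number:
        return None  # Vicmap disagrees, so leave it for manual review
    updated = dict(osm_tags)
    updated["addr:unit"] = unit
    updated["addr:housenumber"] = number
    return updated


print(split_unit({"addr:housenumber": "2/5"}, "2", "5"))
print(split_unit({"addr:housenumber": "2/5"}, "3", "5"))
```

Returning None whenever Vicmap does not confirm the split keeps the existing OSM value untouched, in line with flagging disagreements for manual review.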
Soon I'll share some links to actual maps and data that show exactly
what would be changed so we can do more QA.
Daniel O'Connor
2021-05-21 01:11:21 UTC
Could you elaborate a bit on the sequence or chunking of the data, and how
you'd go about importing/QAing, and rough timelines?

The per suburb approach + review approach used by
https://gitlab.com/dionmoult/osm-nsw-address-import for example might be a
good level of granularity and process to follow.
Bob Cameron
2021-05-24 22:12:39 UTC
Hi Andrew

Wonder if you might add the following to your later communications with
NHVR. Not pressing, just an inclusion.

As I understand it formal heavy vehicle rest area locations are more a
state database system. The "three green dot" informal rest areas however
were/are an NHVR initiative. If they have a (changing) database of these
locations it might be handy if they were also imported.

Tnx Bob