This is the technical support forum for WPML - the multilingual WordPress plugin.

This topic contains 6 replies, has 2 voices.

Last updated by Raja Mohammed 5 months, 1 week ago.

Assigned support staff: Raja Mohammed.

April 15, 2019 at 9:54 pm #3616611

julieD-6

Our site is multilingual, in 4 languages.
We use the format standexelectronics.com/de/ for German, standexelectronics.com/ja/ for Japanese, and so on.

Our URLs contain language-specific characters.
We have submitted the sitemap, and we continue to get "Crawl Issues" for all URLs that contain non-Latin characters. Have you heard of this? Do you have a suggestion to fix it so we don't get crawl errors?

We thought it might have something to do with the way we manage rel=canonical, where we point back to the English version of the page. However, many of our German-language pages work fine as long as their URLs contain no non-Latin characters. The only ones that are "broken" are the URLs with non-Latin characters.

The leading theory is that we need to convert all URLs into their percent-encoded (UTF-8) counterparts.
But before we do anything (we have thousands of links that would need to be fixed), we want a definitive answer on what the problem is and how to fix it.

Here is the sitemap in question: hidden link. There are 2 other sitemaps (one for German, one for Chinese).

Here are just a few examples of the 2K-plus URLs that are being flagged with crawl issues. Notice that in all cases, the URL has non-Latin characters:
hidden link
hidden link
hidden link
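
One way to check the "all flagged URLs have non-Latin characters" observation against the sitemap itself is to list every sitemap entry that contains a non-ASCII character and compare that list with the flagged URLs. The following is only a minimal sketch: the sample sitemap and the example.com URLs are made up for illustration, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by <urlset> documents in the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def non_ascii_locs(sitemap_xml):
    """Return every <loc> value that contains at least one non-ASCII character."""
    root = ET.fromstring(sitemap_xml)
    locs = [
        el.text.strip()
        for el in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if el.text
    ]
    return [loc for loc in locs if not loc.isascii()]


# Hypothetical sitemap content; example.com stands in for the real site.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/de/produkte/</loc></url>
  <url><loc>https://example.com/de/über-uns/</loc></url>
  <url><loc>https://example.com/ja/製品/</loc></url>
</urlset>"""

flagged_candidates = non_ascii_locs(SAMPLE)
```

If the resulting list matches the URLs reported in Search Console, that supports the correlation, although it does not by itself prove that the characters are the cause.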

I am wondering if they should appear in our sitemap like this instead:
hidden link
hidden link
hidden link

If they should look like that, do you have a tool or a button I am missing that lets me convert the sitemap into that format? Or is this a Yoast issue?
Will your Yoast plugin solve my problem?
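
For reference, the conversion being asked about (non-Latin characters to percent-encoded UTF-8) can be sketched in a few lines of Python. This is an illustration under assumptions, not a WPML or Yoast feature: the function name and the example.com URLs are hypothetical, and the host name is left untouched because internationalized domain names use a different encoding. The sitemaps.org protocol asks for URLs to be URL-escaped, but this thread does not establish that missing escaping is the cause of the crawl issues.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_encode_url(url):
    """Percent-encode non-ASCII characters in the path and query as UTF-8 bytes."""
    parts = urlsplit(url)
    # "%" is treated as safe so that an already-encoded URL is not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


# Hypothetical URLs; example.com stands in for the real site.
german = percent_encode_url("https://example.com/de/über-uns/")
japanese = percent_encode_url("https://example.com/ja/製品/")
```

Running the function on a URL that is already encoded returns it unchanged, so it can be applied to a mixed list of encoded and unencoded URLs.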

April 16, 2019 at 7:29 am #3618917

Raja Mohammed
Supporter

Languages: English (English)

Timezone: Asia/Kolkata (GMT+05:30)

Hi there,

Welcome to our forum.

I tried to access the Chinese and German URLs. The response was very slow, and I also encountered a 503 server error, which might be the reason for the crawl errors.

A crawl error appears when Google is unable to crawl a specific URL because of a restriction.
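
To confirm whether a given page really returns a server error, the status code can be checked directly. The sketch below uses only the Python standard library; the function names and the User-Agent string are hypothetical, and the URL passed in is assumed to be percent-encoded already, since `urllib.request` expects ASCII URLs.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=30.0):
    """Return the HTTP status code of a GET request to a percent-encoded URL."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "crawl-check-sketch/0.1"}  # made-up agent name
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is what we want.
        return err.code


def is_server_error(status):
    """True for 5xx codes such as 503 (Service Unavailable) or 504 (Gateway Timeout)."""
    return 500 <= status <= 599
```

Calling `is_server_error(fetch_status(url))` for each flagged URL gives a quick yes/no per page. Note that `urlopen` follows redirects, so the code returned is the one for the final destination.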

Please increase the PHP max_execution_time, max_input_vars, and memory_limit settings.

Let me know if you need further assistance. I will be happy to help you with your issue.

Kind regards
Raja

April 20, 2019 at 3:02 pm #3650397

julieD-6

We have increased these 3 well beyond typical limits.

For example, our settings are:
max_execution_time = 1200
max_input_vars = 5000
memory_limit = 1024M

Can you answer this question: "Have you heard of this situation, where URLs with non-Latin characters are not crawled properly?"

April 22, 2019 at 6:01 am #3653991

Raja Mohammed
Supporter

Hi there,

I have never heard of such a situation. Google will index URLs with non-Latin characters.

I have given the solution for the crawl issue. As far as I understand, your situation is not related to non-Latin characters but to inaccessible pages that were returning a 504 error.
Now that the pages are accessible, try to reindex the site in Google Search Console.

If you are still having issues after reindexing, please post the actual error and a screenshot of it.

Please let me know the results.

Kind regards
Raja

May 2, 2019 at 8:58 pm #3729271

julieD-6

This has not been resolved. Please keep this ticket open.

May 2, 2019 at 9:04 pm #3729277

julieD-6

I have spoken to our dedicated host provider and they say that they are not seeing any 503 or 504 errors.

I am back to non-Latin characters. It is the one common denominator.
Absolutely every URL that has a crawl error has non-Latin characters.

Can you ask around among your teammates and see if they have any idea what this might be?
It has something to do with the way the site is being translated, which leads us back to WPML. I am not saying it is a fault of the plugin, but it is the best place to start looking for answers.

May 3, 2019 at 9:59 am #3732851

Raja Mohammed
Supporter

Hi there,

There seems to be no issue with Google crawling non-Latin / non-English characters. I have also confirmed this with the team.

Since you have fixed the 503 and 504 errors, you might have to mark all crawl errors as fixed in Google Search Console and try to reindex the site again. If you encounter crawl errors again, please share the detailed error from the console with us, along with a screenshot.

Also, make sure the browser language redirect is disabled under WPML > Languages > Browser language redirect.

Kind regards
Raja

The topic ‘[Closed] Crawl Errors in Search Console’ is closed to new replies.