
/q/ - arisuchan meta

discuss arisuchan itself. comments and questions welcome.



File: 1493813303426.png (241.21 KB, 736x561, 8c2b073c90fbd99a4e75ff91d3….png)

 No.540

As we continue our campaign to expand the community, it would be very helpful for us to have web analytics software installed on our server to provide insight into the site's activity. At the same time, we respect the rights and privacy of our users so we would like to see if there is a general consensus on whether or not we have the community's consent to use an open source analytics platform on the website.

If you would like to learn more about the platform we are considering before offering feedback on the idea, please visit the official Piwik website at https://piwik.org/privacy/. Keep in mind that data will never be shared with a third party because there are no third parties involved. The analytics software would be hosted on the same VPS as this image board.

We hope you consider this opportunity. Thank you for your feedback.

 No.542

If it doesn't share data with third parties at all, I'm okay with it.

 No.545

>>540
I don't mind this.

 No.546

I feel mostly indifferent about it, which is also my general attitude towards mass data-mining, post-awareness.
(The above evaluates to sure, why not)

 No.547

Can you by any chance make a chart about the data collection that will be done?

 No.548

I am strigdly against it :DD

No, for real, I am.

 No.551

>>547
By default Piwik will track the following:

User IP address
Date and time of the request
Title of the page being viewed (Page Title)
URL of the page being viewed (Page URL)
URL of the page that was viewed prior to the current page (Referrer URL)
Screen resolution being used
Time in local user’s timezone
Files that were clicked and downloaded (Download)
Links to an outside domain that were clicked (Outlink)
Pages generation time (Page speed)
Location of the user: country, region, city, approximate latitude and longitude (Geolocation)
Main Language of the browser being used (Accept-Language header)
User Agent of the browser being used (User-Agent header)


There is an optional feature to automatically anonymize the IP address of visitors. This would be enabled on our instance of the platform. Additionally, we will configure it to respect the DoNotTrack preference set in your browser.

https://piwik.org/docs/privacy/#step-1-automatically-anonymize-visitor-ips

More information about privacy can be found here:

https://piwik.org/privacy/

 No.571

In light of the fact that there have been no objections to using open source, locally hosted analytics software, I went ahead and installed Piwik on the server. If you would prefer to not have your anonymous web browsing data collected, you may either install a third-party adblocker (e.g., AdGuard, Disconnect, etc.) or check the 'do not track' option in your browser settings. http://donottrack.us/

 No.698

>>540
>>571
The difference between analytics and surveillance starts with goals and ethics.

What are you trying to measure? The number of bits in the track list of >>551 is clearly more than enough to uniquely identify all the users here, if not all humanity.

This new lainchan with analytics you now have is clearly unethical. Not only do you collect all of >>551, but you also submit most of that information to google via js and fonts. I'm fine on my Tor Browser with everything third-party off by default and most of those values you measure spoofed. But why would you subject less sophisticated users to surveillance on a board that purports to respect /cyb/ values, including, one would assume, privacy and anonymity, I can't possibly understand.

This is going to be my last informative post here until you reconsider, because now I have no other way to stay ethical towards other lains than to be as benign as possible, so that nothing I say could possibly damage someone else's police record. Which Google kindly collects for you right now, and which a guy who hacks into your server, or you yourself fucking up the server config, would provide to the NSA later.

A somewhat more ethical setup would be:

>User IP address

Must not be recorded. Should be converted to a country code/"tor exit" with geoip and tor exit nodes list. Having an .onion would help too, btw.

> Date and time of the request

Must be rounded to 15-60 or more minutes depending on the amount of traffic to the board. No single user should be identifiable by post time.

> Title of the page being viewed (Page Title)

Fine.

> URL of the page being viewed (Page URL)

Fine on lainchan, but most of the time should be cleaned up of ids and other soykaf.

> URL of the page that was viewed prior to the current page (Referrer URL)

Must be truncated to domain name.

> Screen resolution being used

> Time in local user’s timezone
> Links to an outside domain that were clicked (Outlink)
Must not be recorded.

> Files that were clicked and downloaded (Download)

Tracking links to outside domains is unacceptable.
Local links might be ok, depending on the threat model. Consider that raiding a given lain's computer with downloaded PDFs and pics can be used to link that lain to at least some posts.
Aggregated per URL popularity is fine.

> Pages generation time (Page speed)

> Location of the user: country, region, city, approximate latitude and longitude (Geolocation)
Country is ok, nothing else must be recorded.

> Main Language of the browser being used (Accept-Language header)

What for? Must not be recorded.

> User Agent of the browser being used (User-Agent header)

One bit (mobile/desktop) is fine. Everything else is unreliable for sophisticated users anyway.
OS type (Linux, Windows, OSX, other BSD) might be ok (but unreliable).
Nothing else must be recorded.

> No third-party resources or trackers.

Obviously.

A truly ethical setup would be to compute counters directly instead of recording any request data at all, and to throw away any counters with fewer than 50 requests/day so that, for instance, users from Iran can't be identified just by correlating your per-country counter for Iran with some logs the Iranian police already have. This is what the tor daemon does for relay stats, btw.

> Country/"tor exit" distribution.

> URL (thread) popularity.
> Load/hour/time of day.
> Referer domain popularity.
> Mobile/desktop popularity.
Are both useful to you and not invasive for lains if you respect the 50/day rule above.
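
A minimal sketch of the counters-only idea above, in Python. Every name in it (DailyCounters, hour_bucket, referrer_domain, MIN_COUNT) is made up for illustration; this is not how Piwik or the tor daemon implement anything. It assumes the country code and the mobile/desktop bit have already been derived elsewhere, and it only keeps aggregated counts, never the request itself.

```python
from collections import Counter
from datetime import datetime
from urllib.parse import urlsplit

MIN_COUNT = 50  # the "50 requests/day" rule from the post above


def hour_bucket(ts: datetime) -> str:
    """Round a request time down to the hour so no single request can be timed."""
    return ts.replace(minute=0, second=0, microsecond=0).strftime("%Y-%m-%d %H:00")


def referrer_domain(referrer: str) -> str:
    """Keep only the host name of a referrer URL; path and query are dropped."""
    return urlsplit(referrer).hostname or "(none)"


class DailyCounters:
    """Hypothetical per-day aggregate counters; nothing per-request is stored."""

    def __init__(self) -> None:
        names = ("country", "url", "hour", "referrer", "device")
        self.counters = {name: Counter() for name in names}

    def record(self, country: str, url: str, ts: datetime,
               referrer: str, is_mobile: bool) -> None:
        # country is a country code or the literal string "tor exit"
        self.counters["country"][country] += 1
        self.counters["url"][url] += 1
        self.counters["hour"][hour_bucket(ts)] += 1
        self.counters["referrer"][referrer_domain(referrer)] += 1
        self.counters["device"]["mobile" if is_mobile else "desktop"] += 1

    def report(self) -> dict:
        """Publish only counters that reached MIN_COUNT; rarer ones are discarded."""
        return {
            name: {key: n for key, n in counter.items() if n >= MIN_COUNT}
            for name, counter in self.counters.items()
        }
```

With this shape, a country that sent only a handful of requests in a day simply never shows up in report(), which is the point of the threshold.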

 No.700

>>698
>Not only do you collect all of >>551, but you also submit most of that information to google via js and fonts
All remote assets have been removed. Everything is now hosted on our server.

>Must not be recorded.

We do not collect IP addresses. We have elected to use the anonymization feature from the beginning.

 No.718

My request is a stats box. It shares the number of posts, posters, and active content. Since this information is typical on most imageboards, I'm not requesting much or violating anonymity.

 No.1809

Did we ever land on what we're going to use for our web analytics? Or are we still using Piwik?
>>718
I would also like to see some statistics on our end.

 No.1923

Please remove this. Propagating privacy, but watching us fapping and click spasms by yourself. Not nice.
Also I hope the server logs have IPs deactivated too.
For nginx there's a plugin to replace IPs with a unique hash which can't be traced back to the original IP, but you can still see by the hash which requests were made by the same visitor for debugging.
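
A rough sketch of what such IP hashing could look like, in Python. This is not the nginx plugin's actual method (the post doesn't name it); pseudonymize_ip and SECRET_KEY are hypothetical names. It uses a keyed hash (HMAC-SHA256) because a plain unkeyed hash of an IPv4 address can be reversed by simply hashing all ~4 billion addresses. The "can't be traced back" property only holds while the key stays secret, and throwing the key away regularly also stops long-term linking.

```python
import hashlib
import hmac
import secrets

# Assumption: a random key generated at startup and never written to disk.
SECRET_KEY = secrets.token_bytes(32)


def pseudonymize_ip(address: str, key: bytes = SECRET_KEY) -> str:
    """Replace an IP address with a short keyed hash.

    The same address maps to the same token while the key is unchanged,
    so requests from one visitor can still be grouped for debugging.
    """
    digest = hmac.new(key, address.encode("ascii"), hashlib.sha256).hexdigest()
    return digest[:16]  # truncated for readable log lines
```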

 No.1925

So, why? Why? WHY? W H Y ?

Seph can use this to farm money, right? Therefore I am going to assume you are now selling out our info for money, Seph, and that is the only reason you're doing this. First, you beg for server money, while also saying 'this is the last time I beg for money, teehee i promise'. Then you beg for your lifestyle change and move to South-East Asia. Now you decide to farm your community even more? Amazing Seph, simply amazing.

 No.1926

>>1925
You must be as dumb as you look. Why?

 No.1927

>>1925
What are you on about? OP is over a year old. Where were your objections then? Also no one can "farm money" from anonymized analytics on a small website.

 No.1928

Has Seph ever published any analytics data? I'd like to see it.

 No.1929

The userbase here is tiny as hell anyway, why do you need analytics on it. This is dumb. Supremely dumb. Please leave us be.

 No.1930

File: 1529116221526.png (243.79 KB, 302x485, 116.png)

>>1809
>Did we ever land on what we're going to use for our web analytics? Or are we still using Piwik?
Yes, we are still using Piwik, but the software has been renamed to Matomo if you want to read more about it.
see https://matomo.org/

>>1923
>Please remove this. Propagating privacy, but watching us fapping and click spasms by yourself. Not nice.
I will assume sarcasm until proven otherwise.

>>1925
>Seph can use this to farm money, right?
No, I can't. Even if I could, I wouldn't. There was a period of time, long before I made this website, where I was in a very, very tough spot. Someone wanted to take advantage of this fact and reached out to me on IRC.

They asked if I would sell them the raw database to Cyberpunk Forums for an unreasonable amount of money (I don't know why they thought it was so valuable). I told them to go fuck themselves. Even when I'm desperate, I'm not degenerate.

>Therefore I am going to assume you are now selling out our info for money, Seph, and that is the only reason you're doing this.

If that is what you genuinely believe, please leave and do not return.

>>1928
>Has Seph ever published any analytics data? I'd like to see it.
No. I haven't. Is there something specific you would like to see?

>>1929
>The userbase here is tiny as hell anyway, why do you need analytics on it.
It can be helpful to see where traffic is coming from. Also I don't want the userbase to be "tiny as hell". I would like to see it flourish to the point where activity is self-sustaining. I don't think we're there yet. I think analytics can provide some insight into what we can do better.

I would also like to repeat and clarify a few points already mentioned by staff in this thread:

1. our web server, Nginx, is configured to not log anything whatsoever

2. we anonymize all IP addresses by reducing them to 2 bytes
see https://matomo.org/docs/privacy/#step-1-automatically-anonymize-visitor-ips

3. we respect the "do not track" configuration set in your web browser
see https://matomo.org/docs/privacy/#step-4-respect-donottrack-preference

4. we do not circumvent filter lists such as EasyPrivacy

5. all resources are locally hosted and therefore no remote connections should be made while browsing our website

6. all traffic is encrypted using the absolute highest encryption standards
see https://www.ssllabs.com/ssltest/analyze.html?d=arisuchan.jp (try and find a website that does better)

If, even after taking these facts into consideration, there are reasonable objections to our continued use of Matomo, please make them known here. I will respect the wishes to delete the software if a majority of the community agrees. Otherwise, if you have any other questions or concerns about our use of analytics, please feel free to respond to this thread.
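
To illustrate point 2 above, here is a small Python sketch of masking an IPv4 address down to its first 2 bytes. It reads "reducing them to 2 bytes" as zeroing the last two bytes, which is an assumption; anonymize_ipv4 is a made-up helper, not Matomo's own code, and IPv6 is left out.

```python
import ipaddress


def anonymize_ipv4(address: str, masked_bytes: int = 2) -> str:
    """Zero the last `masked_bytes` bytes of an IPv4 address."""
    ip = ipaddress.IPv4Address(address)
    # Build a 32-bit mask that keeps only the leading bytes.
    mask = (0xFFFFFFFF << (8 * masked_bytes)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(int(ip) & mask))


print(anonymize_ipv4("203.0.113.77"))  # 203.0.0.0
```

Every visitor in the same /16 block ends up with the same stored value, so the address alone no longer singles anyone out.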

 No.1931

I am against collecting that much user-specific information. I'd support it if "fingerprinting" data (IPs, user agent strings, etc.) were excluded from all data collection, making it hard or impossible to profile users.

 No.1954

>>1931

I agree.

Matomo "Features" Include:
Geolocation
Locate your visitors for accurate detection of Country, Region, City, Organization. View the visitors statistics on a World Map by Country, Region, City. View your latest visitors in real time.
Pages Transitions
View what visitors did before, and after viewing specific page.

super lame.

 No.1955

File: 1529423524767.jpg (26.07 KB, 602x452, 1213771996866.jpg)

Interesting software, Piwik. I may use this myself.

I would like to understand what makes you think analytics would help grow/sustain the community. I am of the opinion that analytics is best paired with a marketing campaign, in order to determine the effectiveness of the campaign itself. This would be called the conversion rate if this site were a business trying to attract new customers.

Past that, you should already have access to the data you need in order to determine which board is more popular. Visitor flow? Post popularity or poster popularity? Where a majority of your visitors are coming from, in order to market to a demographic better?

Please do not take these questions as accusations or aggression on my part. I am coming from a business perspective which is all I can offer in this thread.

 No.1980

>>1930
A little ignorant.
Especially since users are identifiable not just by IP but by general metadata plus browser fingerprint, which Piwik/Matomo definitely collects.

 No.1981

How many days till humble Filipino pig farmer buys out this board?
Your bets, Arisu

 No.1982

>>1980

Then you probably know that if you practice good habits when browsing the internet, this should not even be a problem.

I've gone to Matomo, and one interesting thing I found was that the metadata collected actually has little to nothing that identifies a user. The only thing that really annoyed me was that they collect the name during the report, but since names are randomized by default here, that amounts to little to nothing.

So yes, it does, but so little information is almost irrelevant and can barely build a profile on you. I could be very wrong, but I did not find anything showing that the metadata collected specifically tries to undermine user anonymity.

 No.1983

>>1982
think twice:
it's enough data for Matomo to know it's the same user, so it won't count him twice.


